




PROCEEDINGS 

OF THE

ROYAL SOCIETY OF EDINBURGH

Section A (Mathematical and Physical Sciences)

VOL. LXII

1943-1949


PUBLISHED BY 
OLIVER & BOYD 
EDINBURGH: TWEEDDALE COURT 
LONDON: 98 GREAT RUSSELL STREET, W.C.1

1949 





CONTENTS 


NO. 

1. The Future of Synthetic Plastics. By Professor H. W. Melville, F.R.S. (Bruce-Preller Lecture delivered March 1, 1943.) Issued separately November 17, 1943.

2. The Fundamental Concepts of Natural Philosophy. By Professor E. A. Milne, M.B.E., F.R.S. (James Scott Lecture delivered on May 3, 1943.) Issued separately November 17, 1943.

*3. On the Matrix Representation of Complex Symbols. By D. E. Rutherford, M.A., B.Sc., D.Math., United College, University of St Andrews. Issued separately November 17, 1943.

*4. A Note on Karl Pearson’s Selection Formulae. By D. N. Lawley, B.A., Moray House, University of Edinburgh. Communicated by Professor Godfrey H. Thomson, D.C.L., D.Sc., Ph.D. Issued separately November 17, 1943.

*5. On Whittaker’s Solution of Laplace’s Equation. By E. T. Copson, University College, Dundee, in the University of St Andrews. Issued separately March 7, 1944.

*6. Atomic Wave Functions for Ground States of Elements Li to Ne. By W. E. Duncanson, M.Sc., Ph.D., University College of London, and C. A. Coulson, M.A., Ph.D., University College, Dundee. Issued separately March 7, 1944.

*7. Quantum Mechanics of Fields. I. Pure Fields. By Professor Max Born, F.R.S., and H. W. Peng, Ph.D., University of Edinburgh. Issued separately March 7, 1944.

*8. A Measurement of the Velocity of Light in Water. By R. A. Houstoun, M.A., D.Sc., F.Inst.P., Natural Philosophy Department, University of Glasgow. Issued separately March 7, 1944.

9. On the Line-Geometry of the Riemann Tensor. By H. S. Ruse, University College, Southampton. Issued separately September 12, 1944.

*10. The Factorial Analysis of Multiple Item Tests. By D. N. Lawley, B.A., Moray House, University of Edinburgh. Communicated by Professor Godfrey H. Thomson. Issued separately September 12, 1944.

*11. The Identification of Klein’s Quartic. By W. L. Edge, M.A., Sc.D. Issued separately September 12, 1944.

*12. Quantum Mechanics of Fields. II. Statistics of Pure Fields. By Professor Max Born, F.R.S., and H. W. Peng, Ph.D., Carnegie Research Fellow, University of Edinburgh. Issued separately September 22, 1944.

*13. A Problem in the Random Distribution of Particles. By Philip Eggleton, D.Sc., and William Ogilvie Kermack, D.Sc., LL.D., F.R.S. From the Physiology Department, University of Edinburgh, and the Royal College of Physicians Laboratory, Edinburgh. (With Eight Text-figures.) Issued separately September 22, 1944.

*14. On Substitutional Equations. By D. E. Rutherford, M.A., B.Sc., D.Math., United College, University of St Andrews. Issued separately December 14, 1944.

*15. Quantum Mechanics of Fields. III. Electromagnetic Field and Electron Field in Interaction. By Professor Max Born, F.R.S., and H. W. Peng, Ph.D., Carnegie Research Fellow, University of Edinburgh. Issued separately December 14, 1944.

*16. Studies in Practical Mathematics. IV. On Linear Approximation by Least Squares. By A. C. Aitken, D.Sc., F.R.S., Mathematical Institute, University of Edinburgh. Issued separately October 24, 1945.

17. The Regraduation of Clocks in Spherically Symmetric Space-times of General Relativity. By G. C. McVittie, Ph.D. Issued separately November 5, 1945.




18. The Riemann Tensor in a Completely Harmonic $V_4$. By H. S. Ruse, University College, Southampton. Issued separately November 5, 1945.

19. A Theory of Regraduation in General Relativity. By A. G. Walker, D.Sc., Department of Pure Mathematics, University of Liverpool. Communicated by Sir Edmund Whittaker, F.R.S. Issued separately March 21, 1946.

20. Evaluation and Application of Certain Ladder-Type Networks. By W. E. Bruges, M.Sc., A.C.G.I., D.I.C., Assoc.M.Inst.C.E. Communicated by Professor M. G. Say, Ph.D., M.Sc. (With Seven Text-figures.) Issued separately March 25, 1946.

21. Tables of Chebyshev Polynomials. By C. W. Jones, M.Sc., and J. C. P. Miller, Ph.D., of the University of Liverpool, and J. F. C. Conn, D.Sc., and R. C. Pankhurst, Ph.D., of the National Physical Laboratory. Communicated by Dr A. C. Aitken, F.R.S.

22. Two Numerical Applications of Chebyshev Polynomials. By J. C. P. Miller, Ph.D., University of Liverpool. Communicated by Dr A. C. Aitken, F.R.S. Issued separately October 25, 1946.

23. The Number of the Elements. By N. Feather, Ph.D., Cavendish Laboratory, Cambridge. Communicated by Sir Edmund Whittaker, F.R.S. (With Three Text-figures.) (Ritchie Lecture delivered March 1, 1945.) Issued separately October 25, 1946.

24. Time-Scales in Relativity. By A. G. Walker, M.A., D.Sc., Department of Pure Mathematics, University of Liverpool. Communicated by Sir Edmund Whittaker, F.R.S. Issued separately October 25, 1946.

*25. Some Continuant Determinants arising in Physics and Chemistry. By D. E. Rutherford, M.A., B.Sc., D.Math., United College, University of St Andrews. (With Four Text-figures.) Issued separately May 8, 1947.

*26. The Universal Integral Invariants of Hamiltonian Systems and Application to the Theory of Canonical Transformations. By Hwa-Chung Lee, Ph.D. (Edin.), Wuhan University, China. Communicated by Sir Edmund Whittaker, F.R.S. Issued separately May 8, 1947.

*27. Expansions of Lamé Functions into Series of Legendre Functions. By A. Erdélyi, Mathematical Institute, The University, Edinburgh. Issued separately February 25, 1948.

*28. The Discriminant of a Certain Ternary Quartic. By W. L. Edge, M.A., Sc.D. (Cantab.), Mathematical Institute, University of Edinburgh. Issued separately February 25, 1948.

*29. On a Problem in Correlated Errors. By A. C. Aitken, D.Sc., F.R.S., Mathematical Institute, University of Edinburgh. Issued separately February 25, 1948.

30. On Hill’s Problems with Complex Parameters and a Real Periodic Function. By M. J. O. Strutt. Communicated by Sir Edmund Whittaker, F.R.S. (With Five Text-figures.) Issued separately May 12, 1948.

*31. Thermal Diffusion in some Aqueous Solutions. By Archibald C. Docherty and Mowbray Ritchie, Chemistry Department, University of Edinburgh. (With Five Text-figures.) Issued separately May 12, 1948.

*32. An Elementary Treatment of Thermal Diffusion in Gaseous and Liquid Systems. By Mowbray Ritchie, Chemistry Department, University of Edinburgh. (With Three Text-figures.) Issued separately May 12, 1948.

33. Applications of Elliptic Functions to Wind Tunnel Interference. By L. M. Milne-Thomson. Issued separately May 12, 1948.

34. Foundations of Relativity: Parts I and II. By A. G. Walker, D.Sc., Department of Mathematics, University of Sheffield. Communicated by Sir Edmund Whittaker, F.R.S. (With One Text-figure.) Issued separately May 12, 1948.







35. Graphite Crystals and Crystallites. I. Binding Energies in Small Crystal Layers. By Mary Bradburn (Royal Holloway College, London University), C. A. Coulson (Wheatstone Physics Laboratory, King’s College, London), and G. S. Rushbrooke (Chemistry Department, Leeds University). (With Two Text-figures.) Issued separately September 22, 1948.

36. Graphite Crystals and Crystallites. II. Energies of Mobile Electrons in Crystallites Infinite in One Direction. By C. A. Coulson (Wheatstone Physics Laboratory, King’s College, London) and G. S. Rushbrooke (Chemistry Department, Leeds University). (With Three Text-figures.) Issued separately September 22, 1948.

37. The Van der Waals Force between a Proton and a Hydrogen Atom. II. Excited States. By C. A. Coulson (King’s College, London) and Miss C. M. Gillam (University College, Dundee). (With One Text-figure.) Issued separately September 22, 1948.

*38. On the Estimation of Many Statistical Parameters. By A. C. Aitken, D.Sc., F.R.S., Mathematical Institute, University of Edinburgh. Issued separately September 22, 1948.

*39. Transformations of Hypergeometric Functions of Two Variables. By A. Erdélyi, Mathematical Institute, The University, Edinburgh. Issued separately September 22, 1948.

*40. The Linear Difference-differential Equation with Constant Coefficients. By E. M. Wright, University of Aberdeen. Issued separately May 20, 1949.

*41. Problems in Factor Analysis. By D. N. Lawley, M.A., D.Sc. Issued separately February 3, 1949.

42. The Nature of Scientific Philosophy. By Professor Herbert Dingle, D.I.C., A.R.C.S., D.Sc. Issued separately May 20, 1949.

43. On the Gravitational Mass of a System of Particles. By G. L. Clark, Trinity College, Cambridge. Communicated by Sir Edmund Whittaker, F.R.S. Issued separately May 20, 1949.

44. The Equivalence of the Gravitational and Invariant Mass of an Isolated Body at Rest. By G. L. Clark, Trinity College, Cambridge. Communicated by Sir Edmund Whittaker, F.R.S. Issued separately May 20, 1949.

45. The Internal and External Fields of a Particle in a Gravitational Field. By G. L. Clark, Trinity College, Cambridge. Communicated by Sir Edmund Whittaker, F.R.S. Issued separately May 20, 1949.

46. The Mechanics of Continuous Matter in the Relativity Theory. By G. L. Clark, Trinity College, Cambridge. Communicated by Sir Edmund Whittaker, F.R.S. Issued separately May 20, 1949.

*47. Non-Associative Arithmetics. By I. M. H. Etherington, University of Edinburgh. (With Three Text-figures.) Issued separately May 20, 1949.

*48. On Commuting Matrices and Commutative Algebras. By D. E. Rutherford, M.A., B.Sc., Dr.Math., United College, University of St Andrews. Issued separately May 20, 1949.

49. Generalizations of a Problem of Pillai. By L. Mirsky, Department of Mathematics, University of Sheffield. Communicated by Professor A. G. Walker. Issued separately May 24, 1949.

*50. Quantum Theory of Rest-Masses. By M. Born, F.R.S., and H. S. Green. With Appendices by K. C. Cheng and A. E. Rodriguez, Edinburgh University. (With Two Text-figures.) Issued separately May 24, 1949.

* The thanks of the Society are due to the Carnegie Trust for the Universities of Scotland for 
grants towards the cost of these papers. 









On Substitutional Equations 


XIV.— On Substitutional Equations. By D. E. Rutherford, M.A., B.Sc., D.Math.,
United College, University of St Andrews. 

(MS. received June 12, 1944. Revised MS. received July 24, 1944. Read July 3, 1944) 

1. It is well known that the $n!$ permutations $\epsilon, \sigma_2, \ldots, \sigma_{n!}$ of $n$ letters form a group
which is usually denoted by $S_n$. The permutation $\epsilon$ denotes the identity permutation and
is the unit of the group. A linear combination of these permutations with numerical coefficients
is called a substitutional expression, and any such expression will have the form

$$L = l_1 \epsilon + l_2 \sigma_2 + \ldots + l_{n!} \sigma_{n!}.$$

Since the sum and product of two substitutional expressions are themselves substitutional
expressions, these substitutional expressions are the elements of an algebra called by Weyl
(p. 79) the enveloping algebra of the group $S_n$. It need hardly be pointed out that the basic
elements $\sigma_i$ of this algebra are linearly independent. Since the elements of this algebra may
be regarded as operators acting on functions of the $n$ permuted letters, we adopt the notation
of Young and read products from right to left.
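The conventions of this section can be made concrete with a short sketch in Python. It stores a substitutional expression as a dictionary from permutations to coefficients and forms products read from right to left. The helper names `compose`, `cycle` and `mul` are hypothetical, and the reading of a cycle $(abc)$ as $a \to b \to c \to a$ is an assumption, adopted here because it reproduces the products quoted in § 10.

```python
from fractions import Fraction

def compose(p, q):
    """Product pq read from right to left: q acts first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def cycle(n, *letters):
    """Permutation of the letters 1..n given by one cycle (a b c ...),
    read as a -> b -> c -> ... -> a (an assumed convention)."""
    image = list(range(n))
    for a, b in zip(letters, letters[1:] + letters[:1]):
        image[a - 1] = b - 1
    return tuple(image)

def mul(A, B):
    """Product AB of two substitutional expressions {permutation: coefficient}."""
    out = {}
    for p, a in A.items():
        for q, b in B.items():
            r = compose(p, q)
            out[r] = out.get(r, 0) + a * b
    return {p: c for p, c in out.items() if c != 0}

identity = cycle(3)
# Two expressions in the enveloping algebra of S_3.
A = {identity: Fraction(1), cycle(3, 2, 3): Fraction(2)}             # e + 2(23)
B = {cycle(3, 3, 1): Fraction(1), cycle(3, 1, 2, 3): Fraction(-1)}   # (31) - (123)
AB = mul(A, B)                                                       # -(31) + (123)
```

Storing only the non-zero coefficients keeps the zero expression equal to the empty dictionary, which makes an equation such as $LX = 0$ easy to test.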

The simplest type of substitutional equation is 

$$LX = 0,$$

where L is a given substitutional expression and X is an unknown one. Such equations were 
first studied by Young with a view to solving certain problems in the theory of invariants. We 
recapitulate some of his results in § 2. 

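It has been shown by Young (1927, p. 265) and others that corresponding to each partition
$\alpha$ of $n$ a certain number $(f^\alpha)^2$ of substitutional expressions

$$e^\alpha_{rs}, \quad (r, s = 1, \ldots, f^\alpha)$$

can be constructed with the properties

$$e^\alpha_{rs}\, e^\beta_{tu} = \delta_{\alpha\beta}\, \delta_{st}\, e^\alpha_{ru}, \tag{1}$$

where $\delta$ is the Kronecker function. There are in all $n!$ of these expressions and they form a
basis of the enveloping algebra. Let

$$A = \sum_{\alpha, r, s} a^\alpha_{rs} e^\alpha_{rs}, \quad B = \sum_{\alpha, r, s} b^\alpha_{rs} e^\alpha_{rs}, \quad C = \sum_{\alpha, r, s} c^\alpha_{rs} e^\alpha_{rs},$$

where $a^\alpha_{rs}$, $b^\alpha_{rs}$, $c^\alpha_{rs}$ are numerical coefficients. If $AB = C$, then it can be deduced from (1) that
$U^\alpha_A U^\alpha_B = U^\alpha_C$, where $U^\alpha_A$, $U^\alpha_B$, $U^\alpha_C$ are respectively the matrices

$$[a^\alpha_{rs}], \quad [b^\alpha_{rs}], \quad [c^\alpha_{rs}].$$

Conversely, if $U^\alpha_A U^\alpha_B = U^\alpha_C$ for every partition $\alpha$ of $n$, then it may be shown that $AB = C$.
Further, for every partition $\alpha$ the matrix $U^\alpha_\epsilon$ is a unit matrix of order $f^\alpha$.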

2. Our first problem is that of solving the equation

$$LX = 0, \tag{2}$$

where $L$ is known and $X$ is unknown. If we write

$$L = \sum_i l_{\sigma_i} \sigma_i, \quad X = \sum_j x_{\sigma_j} \sigma_j,$$

where $l_{\sigma_i}$ and $x_{\sigma_j}$ are numerical, then

$$LX = \sum_{i,j} l_{\sigma_i} x_{\sigma_j} \sigma_i \sigma_j = \sum_{j,k} l_{\sigma_k \sigma_j^{-1}} x_{\sigma_j} \sigma_k.$$

Since the permutations $\sigma_k$ are linearly independent, we conclude that the equations

$$\sum_j l_{\sigma_k \sigma_j^{-1}} x_{\sigma_j} = 0, \quad (k = 1, \ldots, n!) \tag{3}$$

must hold if $LX = 0$. Evidently the number of linearly independent solutions of these
equations depends upon the rank of the matrix $\Lambda$ which has $l_{\sigma_k \sigma_j^{-1}}$ for its $kj$th element. If its
rank is $n!$, the only solution of (3) is $x_{\sigma_j} = 0$ for all $j$, and the only solution of (2) is $X = 0$. If,
however, the rank $\lambda$ is less than $n!$, there are $n! - \lambda$ linearly independent non-zero solutions of
(3), and the most general solution of (2) is a linear combination of $n! - \lambda$ independent solutions
(Young, 1900, p. 104).
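The count $n! - \lambda$ can be checked numerically. The sketch below, in Python, builds the matrix whose $kj$th element is the coefficient in $L$ of $\sigma_k \sigma_j^{-1}$ and finds its rank by exact elimination. The expression $L$ is the one used in § 10; the helper names and the cycle convention ($(abc)$ read as $a \to b \to c \to a$, products read from right to left) are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Product pq read from right to left: q acts first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def cycle(n, *letters):
    """One cycle (a b c ...) on the letters 1..n, read as a -> b -> c -> ... -> a."""
    image = list(range(n))
    for a, b in zip(letters, letters[1:] + letters[:1]):
        image[a - 1] = b - 1
    return tuple(image)

def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

group = list(permutations(range(3)))   # the 3! permutations of S_3
# L = (23) - (31) - (321) + (123), the expression of section 10.
L = {cycle(3, 2, 3): 1, cycle(3, 3, 1): -1,
     cycle(3, 3, 2, 1): -1, cycle(3, 1, 2, 3): 1}

# Lam[k][j] is the coefficient in L of sigma_k sigma_j^{-1}.
Lam = [[L.get(compose(sk, inverse(sj)), 0) for sj in group] for sk in group]
lam = rank(Lam)
num_solutions = len(group) - lam
print(lam, num_solutions)
```

For this $L$ the rank comes out as 2, so that there are $3! - 2 = 4$ independent solutions, in agreement with the count obtained at the end of § 10.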

A more fruitful method of attack is to express $L$ and $X$ in terms of the $e^\alpha_{rs}$. Let

$$L = \sum_{\alpha, r, s} l^\alpha_{rs} e^\alpha_{rs}, \quad X = \sum_{\alpha, r, s} x^\alpha_{rs} e^\alpha_{rs}.$$

Then as a result of (2) we have

$$U^\alpha_L U^\alpha_X = 0$$

for every partition $\alpha$ of $n$. It is clear that the equations for the elements of any column of $U^\alpha_X$
are the same as those for any other column. Thus, if $u^\alpha_X$ denote a typical column, the equations
for its elements are comprised in the single matrix equation

$$U^\alpha_L u^\alpha_X = 0.$$

If the rank of $U^\alpha_L$ is $\lambda^\alpha$, then there are $f^\alpha - \lambda^\alpha$ independent parameters in the most general
solution for $u^\alpha_X$. In fact any solution $u^\alpha_X$ may be expressed in the form

$$u^\alpha_X = \sum_t \xi_t\, u^\alpha_{X,t},$$

in which the $u^\alpha_{X,t}$ are the $f^\alpha - \lambda^\alpha$ independent solutions and the $\xi_t$ are a like number of arbitrary
numerical coefficients. The coefficients $\xi_t$ will in general take different values for the different
column vectors $u^\alpha_X$, but the $u^\alpha_{X,t}$ may be taken to be the same for all columns of $U^\alpha_X$. There are
therefore in all $f^\alpha(f^\alpha - \lambda^\alpha)$ arbitrary parameters in the most general solution of the matrix
equation. It follows that the number of linearly independent solutions of the equation $LX = 0$
is (Young, 1927, p. 267)

$$\sum_\alpha f^\alpha (f^\alpha - \lambda^\alpha). \tag{4}$$


3. Before proceeding with the general case we consider the particular case where $L$ is
idempotent; that is, where $L^2 = L$. Since $L$ is idempotent, so is the matrix $U^\alpha_L$; but it is
known that the rank of an idempotent matrix is equal to its trace. Further, if $L = \sum_i l_i \sigma_i$, then

$$\mathrm{tr}\, U^\alpha_L = \sum_i l_i\, (\mathrm{tr}\, U^\alpha_{\sigma_i}) = \sum_i l_i \chi^\alpha_{\sigma_i},$$

since the trace of $U^\alpha_{\sigma_i}$ is simply the character component $\chi^\alpha_{\sigma_i}$. It follows that if $L$ is idempotent
the number of linearly independent solutions of $LX = 0$ is

$$\sum_\alpha f^\alpha \Bigl( f^\alpha - \sum_i l_i \chi^\alpha_{\sigma_i} \Bigr).$$

Now it is well known that $\sum_\alpha (f^\alpha)^2 = n!$ and that $\sum_\alpha f^\alpha \chi^\alpha_{\sigma_i}$ is equal to $n!$ or $0$ according as $\sigma_i$
is or is not the identity permutation $\epsilon$ (Littlewood, p. 46). This leads to the following result.

Theorem 1. If $L$ be an idempotent substitutional expression, the number of linearly
independent solutions of $LX = 0$ is $n!(1 - l_1)$, where $l_1$ is the coefficient of $\epsilon$ in $L$ when $L$ is
expressed in terms of the permutations.

From this point onwards we shall accommodate the printer by dropping the upper suffix a 
wherever it occurs. This omission should present no difficulty to the reader. 

Since $U_L$ is idempotent, it follows that

$$U_X = U_\epsilon - U_L = U_{\epsilon - L}$$

is one solution of the equation

$$U_L U_X = 0. \tag{5}$$

Now if $\lambda$ be the rank of $U_L$, the rank of $U_{\epsilon - L}$ is just $f - \lambda$. There are therefore $f - \lambda$ linearly
independent columns in the matrix $U_{\epsilon - L}$, and this is exactly the number of linearly independent
solutions of the equation

$$U_L u_X = 0.$$

The most general solution of this last equation is therefore

$$u_X = U_{\epsilon - L}\, u_Y,$$

where $u_Y$ is an arbitrary column vector. Similarly, the most general solution of (5) is

$$U_X = U_{\epsilon - L}\, U_Y,$$

where $U_Y$ is an arbitrary matrix. Since this is true for every $\alpha$ we have the following result.

Theorem 2. If $L$ is idempotent, the most general solution of the equation $LX = 0$ is
$X = (\epsilon - L)Y$, where $Y$ is an arbitrary substitutional expression.

Although the solution exhibited in Theorem 2 involves $n!$ arbitrary parameters, there are
nevertheless only $n!(1 - l_1)$ linearly independent solutions. In the general case it is difficult
to pick out this number of linearly independent ones from the $n!$ solutions given by Theorem 2.
In certain special cases, however, which are of some importance, $n!(1 - l_1)$ independent solutions
can be displayed, and these cases we now consider.
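Theorems 1 and 2 may be illustrated on a small case. The Python sketch below takes the idempotent $H = \tfrac{1}{2}(\epsilon + (12))$ in the algebra of $S_3$, forms the $3!$ solutions $(\epsilon - H)\sigma_i$ supplied by Theorem 2, and counts how many of them are independent. The choice of $H$, the helper names and the cycle convention are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Product pq read from right to left: q acts first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def cycle(n, *letters):
    """One cycle (a b c ...) on the letters 1..n, read as a -> b -> c -> ... -> a."""
    image = list(range(n))
    for a, b in zip(letters, letters[1:] + letters[:1]):
        image[a - 1] = b - 1
    return tuple(image)

def mul(A, B):
    """Product AB of two substitutional expressions {permutation: coefficient}."""
    out = {}
    for p, a in A.items():
        for q, b in B.items():
            r = compose(p, q)
            out[r] = out.get(r, 0) + a * b
    return {p: c for p, c in out.items() if c != 0}

def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

group = list(permutations(range(3)))
e, t = cycle(3), cycle(3, 1, 2)
half = Fraction(1, 2)
H = {e: half, t: half}               # idempotent: H H = H
E_minus_H = {e: half, t: -half}      # e - H

# Theorem 2: every (e - H) sigma is a solution of H X = 0.
solutions = [mul(E_minus_H, {s: Fraction(1)}) for s in group]
# Theorem 1: only n!(1 - l_1) of them are independent; here l_1 = 1/2.
count = rank([[X.get(g, 0) for g in group] for X in solutions])
predicted = len(group) * (1 - H[e])
print(count, predicted)
```

The six solutions span a space of dimension 3, which is the value $3!(1 - \tfrac{1}{2})$ given by Theorem 1.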

4. If the permutations $\epsilon, \tau_2, \ldots, \tau_h$ form a subgroup of $S_n$ of order $h$, then
$\tau_i(\epsilon + \tau_2 + \ldots + \tau_h) = (\epsilon + \tau_2 + \ldots + \tau_h)$. It follows immediately from this that

$$H = \frac{1}{h}(\epsilon + \tau_2 + \ldots + \tau_h)$$

is idempotent. The most general solution of the equation $HX = 0$ is therefore $X = (\epsilon - H)Y$,
and so any such solution is a linear combination of the solutions

$$(\epsilon - H)\sigma_i, \quad (i = 1, \ldots, n!).$$

Theorem 1 informs us that of these only $n!(h - 1)/h$ are linearly independent, and it is our
object to pick out this number from the $n!$ solutions just quoted. By resolving $S_n$ into
associate complexes with respect to the subgroup, we can find $n!/h\ (= k)$ permutations $\rho_j$
such that

$$\epsilon + \sigma_2 + \ldots + \sigma_{n!} = (\epsilon + \tau_2 + \ldots + \tau_h)(\epsilon + \rho_2 + \ldots + \rho_k),$$

and it will be demonstrated that the $n!(h - 1)/h$ solutions

$$(\epsilon - H)\tau_i \rho_j, \quad (i = 2, \ldots, h;\ j = 1, \ldots, k)$$

are linearly independent. There can be no relation between solutions involving different $\rho_j$,
since distinct complexes have no elements in common. It is sufficient to show that no relation

$$\sum_{i=2}^{h} \eta_i (\epsilon - H)\tau_i = 0$$

exists unless each numerical coefficient $\eta_i$ vanishes. Such a relation gives

$$\sum_{i=2}^{h} \eta_i \tau_i = \Bigl( \sum_{i=2}^{h} \eta_i \Bigr) H.$$

Comparing the coefficients of $\epsilon$ on each side, we find that $\sum_i \eta_i = 0$, and so

$$\sum_{i=2}^{h} \eta_i \tau_i = 0.$$

Since the permutations $\tau_i$ are linearly independent, we conclude that each $\eta_i = 0$.

Alternative independent solutions of the equation $HX = 0$ are

$$(\epsilon - \tau_i)\rho_j, \quad (i = 2, \ldots, h;\ j = 1, \ldots, k).$$

They are linearly independent since the permutations of $S_n$ are independent. Each obviously
satisfies $HX = 0$.

Again, since $H$ is idempotent, so is $\epsilon - H$. It follows that $X = HY$ is the most general
solution of

$$(\epsilon - H)X = 0.$$

By Theorem 1 there are $k$ independent solutions. These may be chosen to be

$$H\rho_j, \quad (j = 1, \ldots, k),$$

which are clearly independent as each solution involves elements belonging exclusively to one
complex, and because different solutions arise from different complexes.

As a corollary to the foregoing, it is easily proved that if we write

$$H' = \frac{1}{h}(\epsilon + \delta_2 \tau_2 + \ldots + \delta_h \tau_h),$$

where $\delta_i$ is $+1$ or $-1$ according as $\tau_i$ is an even or an odd permutation, then $H'$ is idempotent
and the independent solutions of $H'X = 0$ may be taken to be

$$(\epsilon - H')\delta_i \tau_i \rho_j, \quad (i = 2, \ldots, h;\ j = 1, \ldots, k),$$

or, alternatively,

$$(\epsilon - \delta_i \tau_i)\rho_j, \quad (i = 2, \ldots, h;\ j = 1, \ldots, k),$$

while those of $(\epsilon - H')X = 0$ may be taken to be

$$H'\rho_j, \quad (j = 1, \ldots, k).$$
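The selection of independent solutions described in this section can be tried on the subgroup $\{\epsilon, (123), (321)\}$ of $S_3$, for which $h = 3$ and $k = 2$. The Python sketch below finds representatives $\rho_j$ of the complexes, forms both families of solutions of $HX = 0$, and checks that each family has $n!(h - 1)/h = 4$ independent members. The choice of subgroup, the helper names and the cycle convention are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Product pq read from right to left: q acts first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def cycle(n, *letters):
    """One cycle (a b c ...) on the letters 1..n, read as a -> b -> c -> ... -> a."""
    image = list(range(n))
    for a, b in zip(letters, letters[1:] + letters[:1]):
        image[a - 1] = b - 1
    return tuple(image)

def mul(A, B):
    """Product AB of two substitutional expressions {permutation: coefficient}."""
    out = {}
    for p, a in A.items():
        for q, b in B.items():
            r = compose(p, q)
            out[r] = out.get(r, 0) + a * b
    return {p: c for p, c in out.items() if c != 0}

def add(A, B, c=1):
    """Return the expression A + c*B."""
    out = dict(A)
    for p, b in B.items():
        out[p] = out.get(p, 0) + c * b
    return {p: v for p, v in out.items() if v != 0}

def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

group = list(permutations(range(3)))
e = cycle(3)
T = [e, cycle(3, 1, 2, 3), cycle(3, 3, 2, 1)]     # subgroup of order h = 3
h = len(T)
H = {t: Fraction(1, h) for t in T}

# Representatives rho_j of the complexes T rho_j (rho_1 is the identity).
reps, seen = [], set()
for g in group:
    if g not in seen:
        reps.append(g)
        seen.update(compose(t, g) for t in T)
k = len(reps)

E_minus_H = add({e: 1}, H, -1)
family1 = [mul(mul(E_minus_H, {t: 1}), {r: 1}) for t in T[1:] for r in reps]
family2 = [mul(add({e: 1}, {t: 1}, -1), {r: 1}) for t in T[1:] for r in reps]

def vec(X):
    """Coefficient vector of an expression in the basis of permutations."""
    return [X.get(g, 0) for g in group]

rank1 = rank([vec(X) for X in family1])
rank2 = rank([vec(X) for X in family2])
print(k, rank1, rank2)
```

Both families are annihilated by $H$ on the left, and each has rank 4, so either may serve as a complete set of independent solutions in this case.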

5. We return now to the general equation

$$LX = 0,$$

where $L$ is no longer assumed to be idempotent. The expressions $\epsilon, L, L^2, \ldots$ cannot all
be independent since they are each linear functions of the $n!$ permutations of $S_n$. Let

$$\phi_L(x) = 0$$

be the equation of least degree which is satisfied by $L$. It is called the minimum equation of
$L$ and $\phi_L(x)$ is the minimum function of $L$. It follows from $LX = 0$ that $L^2X = 0$, $L^3X = 0$,
..., and hence $\phi_L(L)X = \phi_L(0)X$, where $\phi_L(0)$ is numerical. Since, however, $\phi_L(L)X = 0$,
it follows that the only solution of $LX = 0$ is $X = 0$ unless $\phi_L(0) = 0$, that is, unless $\phi_L(x)$ has
a factor $x$. It will be proved in this section that if $\phi_L(x)$ has a factor $x^i$, $i > 1$, we can first
replace the equation $LX = 0$ by another equation $MX = 0$ which has the same solutions,
and such that $\phi_M(x)$ has the factor $x$ unrepeated. It will appear that the substitutional
expression $M$ is in fact equal to $AL$, where $A$ is a properly chosen substitutional expression
which possesses an inverse $A^{-1}$. This being so, $LX = 0$ implies $ALX = 0$, or $MX = 0$, while
$MX = 0$ implies $A^{-1}MX = 0$, or $LX = 0$. It is thus ensured that $LX = 0$ and $MX = 0$ have
the same solutions.

The equation $\phi_L(x) = 0$ is satisfied by the matrix $U_L$ for each partition $\alpha$, and
$\phi_L(x)$ therefore contains the reduced characteristic function of each of these matrices as a
factor. In fact $\phi_L(x)$ is the L.C.M. of these reduced characteristic functions, as is easily
proved. For, if $\theta(x)$ is the L.C.M. in question, $\theta(x) = 0$ is satisfied by each matrix $U_L$ and
hence by the expression $L$. Thus $\theta(x)$ contains $\phi_L(x)$ as a factor. On the other hand, $\phi_L(x)$
has $\theta(x)$ as a factor, since the reduced characteristic function of each $U_L$ is a factor of $\phi_L(x)$.
It follows that $\phi_L(x) = \theta(x)$ apart from an irrelevant numerical factor.

Suppose now that

$$U_L = H[P \dotplus Q_1 \dotplus \ldots \dotplus Q_s]H^{-1},$$

where the middle factor is the canonical form of $U_L$, where $P$ is the non-singular submatrix
corresponding to all the non-zero latent roots of $U_L$, and where $Q_i$ is a submatrix of order $q_i$
of the form

$$Q_i = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}.$$

If we now write

$$U_A = H[I \dotplus R_1 \dotplus \ldots \dotplus R_s]H^{-1},$$

where $I$ is a unit matrix of the same order as $P$ and where $R_i$ is a submatrix of order $q_i$ of the
form

$$R_i = \begin{bmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix},$$

then, by actual multiplication,

$$U_A U_L = H[P \dotplus S_1 \dotplus \ldots \dotplus S_s]H^{-1},$$

where $S_i$ is a submatrix of order $q_i$ of the form

$$S_i = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}.$$

Thus $P \dotplus S_1 \dotplus \ldots \dotplus S_s$ is the canonical form of $U_A U_L$, and each canonical submatrix
associated with a zero latent root is of order unity. This means that any elementary divisor
of the matrix $U_A U_L$ corresponding to a zero latent root is linear, and hence the reduced
characteristic function of $U_A U_L$ involves the factor $x$ to the first power only. Thus, if $A$ is
the substitutional expression represented by the matrices $U_A$, the minimum function of $AL$,
being the L.C.M. of the reduced characteristic functions of the matrices $U_A U_L$, contains
the factor $x$ to the first power only. We are led to the following conclusion.

Theorem 3. For any equation $LX = 0$, where $L$ does not possess an inverse, it is possible
to find another equation $MX = 0$ with the same solutions and no others and such that $\phi_M(x)$ contains
the factor $x$ to the first degree only.
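The multiplication on which Theorem 3 rests is easily checked for a single block. The Python sketch below builds the nilpotent block $Q$ and the cyclic block $R$ of a given order and forms the product $RQ$; the function names are hypothetical and the order 4 is chosen only for the illustration.

```python
def nilpotent_block(q):
    """Block Q of order q: units on the superdiagonal, zeros elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(q)] for i in range(q)]

def cyclic_block(q):
    """Block R of order q: a unit in the top right corner and on the subdiagonal."""
    return [[1 if j == (i - 1) % q else 0 for j in range(q)] for i in range(q)]

def matmul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

q = 4
Q, R = nilpotent_block(q), cyclic_block(q)
S = matmul(R, Q)                      # expected: diag(0, 1, 1, 1)
R_transpose = [list(col) for col in zip(*R)]
unit = [[int(i == j) for j in range(q)] for i in range(q)]

for row in S:
    print(row)
```

Since $R$ is a permutation matrix its transpose is its inverse, so the factor $U_A$ built from such blocks is non-singular, while the product $S$ is diagonal with a single zero and is therefore idempotent.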

6. Let the minimum equation of $M$ be

$$\phi_M(x) \equiv x\psi(x) = 0, \tag{6}$$

where $\psi(x)$ is prime to $x$. We may suppose that the constant term in $\psi(x)$ is $+1$ without lack
of generality. This being so, it is clear that $\epsilon - \psi(M)$ has a factor $M$. Since $M$ satisfies
equation (6), it follows that

$$\{\psi(M)\}^2 = \psi(M);$$

that is, $\psi(M)$ and therefore $\epsilon - \psi(M)$ are idempotents. Also, since $\epsilon - \psi(M)$ has a factor $M$,
the equation $MX = 0$ implies $\{\epsilon - \psi(M)\}X = 0$. On the other hand, $\{\epsilon - \psi(M)\}X = 0$ implies
$M\{\epsilon - \psi(M)\}X = 0$; that is, $MX = 0$, since $M$ satisfies (6). The equations $MX = 0$ and
$\{\epsilon - \psi(M)\}X = 0$ therefore have the same solutions. Moreover, since $\epsilon - \psi(M)$ is idempotent,
the most general solution of these equations is $X = \psi(M)Y$, where $Y$ is an arbitrary expression.
The following theorem has now been proved.

Theorem 4. If the minimum equation of $M$ is $x\psi(x) = 0$, where $\psi(x)$ is prime to $x$, then
$X = \psi(M)Y$, where $Y$ is arbitrary, is the most general solution of the equation $MX = 0$.

It will be seen that a combination of Theorems 3 and 4 enables us to solve any equation
$LX = 0$. We first construct the expression $M$ according to the method of § 5. When we
have found the minimum equation of $M$, Theorem 4 gives the required solution. Neither $A$
nor $M$ is by any means unique, and indeed a suitable $A$ can often be found by inspection.
Theorem 3 gives a method for constructing an $A$ when none can be found by inspection. We
shall call $MX = 0$ the prepared form of the equation $LX = 0$.
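Theorem 4 may be verified on the prepared expression $M = \epsilon - (12) + (31) - (123)$ which arises in § 10, where $M^2 = 3M$ and consequently $\psi(x) = 1 - \tfrac{1}{3}x$. The Python sketch below checks that $\psi(M)$ is idempotent, that $M\psi(M) = 0$, and that the solutions $\psi(M)\sigma_i$ span a space of the same dimension as the whole solution space of $MX = 0$. The helper names and the cycle convention are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import permutations

def compose(p, q):
    """Product pq read from right to left: q acts first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def cycle(n, *letters):
    """One cycle (a b c ...) on the letters 1..n, read as a -> b -> c -> ... -> a."""
    image = list(range(n))
    for a, b in zip(letters, letters[1:] + letters[:1]):
        image[a - 1] = b - 1
    return tuple(image)

def mul(A, B):
    """Product AB of two substitutional expressions {permutation: coefficient}."""
    out = {}
    for p, a in A.items():
        for q, b in B.items():
            r = compose(p, q)
            out[r] = out.get(r, 0) + a * b
    return {p: c for p, c in out.items() if c != 0}

def add(A, B, c=1):
    """Return the expression A + c*B."""
    out = dict(A)
    for p, b in B.items():
        out[p] = out.get(p, 0) + c * b
    return {p: v for p, v in out.items() if v != 0}

def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

group = list(permutations(range(3)))
e = cycle(3)
M = {e: 1, cycle(3, 1, 2): -1, cycle(3, 3, 1): 1, cycle(3, 1, 2, 3): -1}
psi_M = add({e: 1}, M, Fraction(-1, 3))       # psi(M) = e - M/3

def vec(X):
    """Coefficient vector of an expression in the basis of permutations."""
    return [X.get(g, 0) for g in group]

# Dimension of the solution space of MX = 0 (nullity of X -> MX).
nullity = len(group) - rank([vec(mul(M, {s: 1})) for s in group])
# Dimension spanned by the solutions psi(M) sigma_i of Theorem 4.
spanned = rank([vec(mul(psi_M, {s: 1})) for s in group])
print(nullity, spanned)
```

The two dimensions agree, so that in this instance no solution of $MX = 0$ escapes the form $\psi(M)Y$.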

7. Successive applications of the methods described in the preceding sections enable us
to solve a system of simultaneous equations

$$M_1 X = 0, \quad L_2 X = 0, \quad \ldots, \quad L_c X = 0, \tag{7}$$

where we have indicated by the notation that the first equation has already been put into a
prepared form. We assume that $\phi_{M_1}(x), \phi_{L_2}(x), \ldots, \phi_{L_c}(x)$ each have a factor $x$, as otherwise
$X = 0$ is the only solution. The most general solution of $M_1 X = 0$ is

$$X = \psi_1(M_1)Y_1,$$

where $Y_1$ is arbitrary. $Y_1$ is not arbitrary, however, if $X$ also satisfies $L_2 X = 0$. In such case
$Y_1$ must satisfy the equation

$$L_2 \psi_1(M_1) Y_1 = 0.$$

Suppose that the prepared form of this equation is $M_2 Y_1 = 0$. Then

$$Y_1 = \psi_2(M_2) Y_2,$$

where $Y_2$ is arbitrary unless $X$ satisfies a further equation $L_3 X = 0$. Proceeding in this manner
we eventually obtain the most general solution of the equations (7). It is

$$X = \psi_1(M_1)\psi_2(M_2) \ldots \psi_c(M_c) Y_c,$$

where $Y_c$ is an arbitrary expression.

8. We have seen that if an equation $MX = 0$ is in a prepared form, then the most general
solution can be expressed as a function of $M$ post-multiplied by an arbitrary expression $Y$.
It is natural to ask whether this is always true even in the case of an unprepared equation
$LX = 0$. We shall show presently that this is not so.

Consider first the case where the minimum equation of $L$ is

$$\phi_L(x) \equiv x^i = 0, \quad (i > 1).$$

It is at once obvious that $X = L^{i-1}Y$ where $Y$ is arbitrary is a solution of $LX = 0$, and indeed it
is the most general solution which can be expressed as a function of $L$ post-multiplied by an
arbitrary expression. It is not, however, necessarily the most general solution of the equation
$LX = 0$. Since each $U_L$ satisfies the equation $x^i = 0$, all the latent roots of each $U_L$ are zero,
and each elementary divisor of each $U_L$ must be of the form $x^j$, where $j \le i$. Suppose those
of $U_L$ are $x^{j_1}, \ldots, x^{j_s}$; then $U_L$ is similar to a direct sum of submatrices of the type $Q_i$ used
in § 5 (MacDuffee, p. 73; Turnbull and Aitken, p. 69), and the orders $j_h$ of these submatrices
are each less than or equal to $i$. It follows that the rank of $U_L$ is $\sum_{h=1}^{s} (j_h - 1)$ or $f - s$. If
$X = L^{i-1}Y$ is the most general solution of $LX = 0$, then the rank of $(U_L)^{i-1}$ must be $s$.
This can only be so if $j_1 = \ldots = j_s = i$ for each partition $\alpha$ (Turnbull and Aitken, p. 62). It is
thus only in exceptional cases that $X = L^{i-1}Y$ is the most general solution of $LX = 0$.
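The rank argument just given can be tried on explicit nilpotent matrices. The Python sketch below builds a direct sum of blocks of the type $Q$ with prescribed orders, takes $i$ to be the largest order (so that $x^i = 0$ is the minimum equation), and compares the number of independent solutions of $Uu = 0$ with the rank of $U^{i-1}$. The function names and the particular orders are assumptions made for the illustration.

```python
from fractions import Fraction

def direct_sum_of_nilpotent_blocks(orders):
    """Direct sum of blocks with units on the superdiagonal, one block per order."""
    n = sum(orders)
    U = [[0] * n for _ in range(n)]
    start = 0
    for q in orders:
        for r in range(q - 1):
            U[start + r][start + r + 1] = 1
        start += q
    return U

def matmul(A, B):
    """Ordinary matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matpow(U, m):
    """m-th power of a square matrix (m >= 0)."""
    n = len(U)
    P = [[int(r == c) for c in range(n)] for r in range(n)]
    for _ in range(m):
        P = matmul(P, U)
    return P

def rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

def compare(orders):
    """Return (number of independent solutions of U u = 0, rank of U^(i-1))."""
    i = max(orders)                       # minimum equation is x^i = 0
    U = direct_sum_of_nilpotent_blocks(orders)
    nullity = len(U) - rank(U)
    reach = rank(matpow(U, i - 1))
    return nullity, reach

print(compare((2, 2)))   # all elementary divisors equal to x^2
print(compare((2, 1)))   # elementary divisors x^2 and x
```

When every block has the full order $i$ the two numbers agree; with orders 2 and 1 there are two independent solutions but $U^{i-1}$ supplies only one, which is the exceptional behaviour described above.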
Suppose next that the minimum equation of $L$ is

$$\phi_L(x) \equiv x^i \psi(x) = 0,$$

where $\psi(x)$ is prime to $x$. Functions $\eta(x)$ and $\zeta(x)$ can be found such that

$$x^i \eta(x) + \psi(x)\zeta(x) = 1,$$

where $\zeta(x)$ is of degree $i - 1$ at most and does not contain the factor $x$. Now

$$F(L) = L^i \eta(L) = \epsilon - \psi(L)\zeta(L)$$

is idempotent, for

$$\{F(L)\}^2 = L^i \eta(L)\{\epsilon - \psi(L)\zeta(L)\} = L^i \eta(L) - \phi_L(L)\eta(L)\zeta(L) = F(L).$$

Since $F(L)$ has a factor $L$, every solution of $LX = 0$ satisfies $F(L)X = 0$, and is therefore of the
form

$$X = \{\epsilon - F(L)\}Y = \psi(L)\zeta(L)Y.$$

Not every solution of this form, however, satisfies $LX = 0$. It will only do so if $Y$ satisfies the
equation

$$L\psi(L)\zeta(L)Y = 0,$$

which we now write

$$KY = 0,$$

where

$$K = L\psi(L)\zeta(L) = L\{\epsilon - F(L)\}.$$

Now since $\epsilon - F(L)$ is idempotent,

$$K^2 = L^2\{\epsilon - F(L)\}, \quad K^3 = L^3\{\epsilon - F(L)\}, \quad \ldots$$

If $\phi_K(x)$ be the minimum function of $K$, then

$$\phi_K(K) = \phi_K(L)\{\epsilon - F(L)\} = 0,$$

and since $\epsilon - F(L)$ does not have a factor $L$, $\phi_K(L)$ must contain a factor $L^i$; that is, $\phi_K(x)$
contains a factor $x^i$. On the other hand,

$$K^i = L^i\{\epsilon - F(L)\} = L^i \psi(L)\zeta(L) = 0,$$

and so

$$\phi_K(x) = x^i,$$

which is the case already discussed. It follows that the most general solution of $KY = 0$
which can be expressed as a function of $K$ post-multiplied by an arbitrary expression $Z$ is

$$Y = K^{i-1}Z = L^{i-1}\{\epsilon - F(L)\}Z.$$

Substituting this result in the value for $X$ we find that

$$X = \{\epsilon - F(L)\}L^{i-1}\{\epsilon - F(L)\}Z = L^{i-1}\{\epsilon - F(L)\}Z = L^{i-1}\psi(L)\zeta(L)Z.$$

Since $L^i \psi(L) = 0$, we can write this solution in the form

$$X = L^{i-1}\psi(L)\zeta(0)Z,$$

or, including the numerical factor $\zeta(0)$ in the arbitrary $Z$, this solution can be written

$$X = L^{i-1}\psi(L)Z.$$

The fact that this is always a solution of $LX = 0$ could of course have been deduced immediately
from the fact that $\phi_L(x) = x^i \psi(x)$, but the foregoing argument tells us more than this. It gives
a criterion by which we can say whether or not it is the most general solution. It is the most
general solution if and only if each elementary divisor of each $U_K$ is $x^i$. In any case it is the
most general solution which can be expressed as a function of $L$ post-multiplied by an arbitrary
expression. This is clear from the fact that if we put

** o. 




9. The methods which we have developed may also be used to solve substitutional
equations of the form

$$LX = R,$$

where $L$ and $R$ are given substitutional expressions.

If $L$ possesses an inverse $L^{-1}$, the above equation has a unique solution

$$X = L^{-1}R.$$

Suppose then that $L$ does not possess an inverse. There must be at least one of the matrices
$U_L$ which is singular and whose minimal function has a factor $x$. It follows that $\phi_L(x)$ has a
factor $x$. Using the same notation as in previous sections, we first construct the expression $A$
such that $A$ has an inverse and such that the minimum function of $M\ (= AL)$ has a factor $x$
unrepeated. Writing $S = AR$, our equation now takes the form

$$MX = S.$$

Since $M\psi(M) = 0$, it follows that this equation has no solutions unless $\psi(M)S = 0$. On the
other hand, we can show that if $\psi(M)S = 0$, then solutions of the equation $MX = S$ certainly
exist. Since the constant term in $\psi(x)$ is $+1$, we can write

$$\psi(M) = \epsilon - M\theta(M).$$

Now

$$S - M\theta(M)S = \psi(M)S = 0,$$

from which it follows that $X = \theta(M)S$ satisfies the equation $MX = S$. The following criterion
has now been established.

Theorem 5. The necessary and sufficient condition that a prepared equation $MX = S$
should have a solution is that $\psi(M)S = 0$.

Let us now assume that the preceding condition is satisfied. Since $\theta(M)S$ is one solution
of the equation $MX = S$, any other solution $X$ of this equation must satisfy the equation

$$M(X - \theta(M)S) = 0.$$

It follows from Theorem 4 that the most general solution of this equation is

$$X - \theta(M)S = \psi(M)Y,$$

where $Y$ is arbitrary. That is, any solution $X$ of the equation $MX = S$ can be expressed in
the form

$$X = \theta(M)S + \psi(M)Y$$

as the sum of a particular solution $\theta(M)S$ and a complementary function $\psi(M)Y$. The fact
that $X$ must consist of the sum of a particular solution and a complementary function was
pointed out by Young (1900, p. 106), although he did not obtain the explicit solution derived
above. The following result has now been achieved.

Theorem 6. If the minimum equation of $M$ is $x\psi(x) \equiv x(1 - x\theta(x)) = 0$ and if the
condition of Theorem 5 is satisfied, then the most general solution of the equation $MX = S$ is
$X = \theta(M)S + \psi(M)Y$, where $Y$ is arbitrary.
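Theorems 5 and 6 may be checked on the data of § 10, where $M = \epsilon - (12) + (31) - (123)$, $S = -\epsilon + (23) - (31) + (321)$, $\psi(x) = 1 - \tfrac{1}{3}x$ and $\theta(x) = \tfrac{1}{3}$. The Python sketch below tests the condition $\psi(M)S = 0$ and then verifies that $\theta(M)S + \psi(M)Y$ satisfies both $MX = S$ and the original equation $LX = R$ for one arbitrarily chosen $Y$. The helper names, the particular $Y$ and the cycle convention are assumptions made for the illustration.

```python
from fractions import Fraction

def compose(p, q):
    """Product pq read from right to left: q acts first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def cycle(n, *letters):
    """One cycle (a b c ...) on the letters 1..n, read as a -> b -> c -> ... -> a."""
    image = list(range(n))
    for a, b in zip(letters, letters[1:] + letters[:1]):
        image[a - 1] = b - 1
    return tuple(image)

def mul(A, B):
    """Product AB of two substitutional expressions {permutation: coefficient}."""
    out = {}
    for p, a in A.items():
        for q, b in B.items():
            r = compose(p, q)
            out[r] = out.get(r, 0) + a * b
    return {p: c for p, c in out.items() if c != 0}

def add(A, B, c=1):
    """Return the expression A + c*B."""
    out = dict(A)
    for p, b in B.items():
        out[p] = out.get(p, 0) + c * b
    return {p: v for p, v in out.items() if v != 0}

e = cycle(3)
t12, t23, t31 = cycle(3, 1, 2), cycle(3, 2, 3), cycle(3, 3, 1)
c123, c321 = cycle(3, 1, 2, 3), cycle(3, 3, 2, 1)

L = {t23: 1, t31: -1, c321: -1, c123: 1}
R = {e: 1, t12: 1, t23: -1, c123: -1}
M = {e: 1, t12: -1, t31: 1, c123: -1}          # prepared form, M = (23) L
S = {e: -1, t23: 1, t31: -1, c321: 1}          # S = (23) R

psi_M = add({e: 1}, M, Fraction(-1, 3))        # psi(M) = e - M/3
particular = add({}, S, Fraction(1, 3))        # theta(M) S with theta(x) = 1/3

# An arbitrary expression Y, chosen only for the illustration.
Y = {t12: Fraction(5), c321: Fraction(-2)}
X = add(particular, mul(psi_M, Y))
```

Any other choice of $Y$ may be substituted; the particular solution is unaffected and only the complementary function changes.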

We can also deal with the simultaneous equations

$$L_1 X = R_1, \quad L_2 X = R_2, \quad \ldots, \quad L_c X = R_c.$$

If any one of these equations has no solution, the set cannot have a solution and so we need
not proceed further. If any $L_i$ has an inverse, then $X = L_i^{-1}R_i$ if a solution exists, but if
$L_h L_i^{-1} R_i \ne R_h$ for some $h$ the equations are inconsistent and have no solution. We may
suppose then that each equation has solutions and that no $L_i$ has an inverse. Let the prepared
form of $L_1 X = R_1$ be $M_1 X = S_1$. Then

$$X = \theta_1(M_1)S_1 + \psi_1(M_1)Y_1.$$

If this solution also satisfies $L_2 X = R_2$, we find that $Y_1$ must satisfy the equation

$$L_2 \psi_1(M_1) Y_1 = R_2 - L_2 \theta_1(M_1) S_1.$$

This equation may have no solution, or, if it has a solution, $L_2 \psi_1(M_1)$ may have an inverse in
which case we proceed as before. If neither of these possibilities is true, let

$$M_2 Y_1 = S_2$$

be the prepared form of this equation. Then

$$Y_1 = \theta_2(M_2)S_2 + \psi_2(M_2)Y_2$$

and

$$X = \theta_1(M_1)S_1 + \psi_1(M_1)\{\theta_2(M_2)S_2 + \psi_2(M_2)Y_2\}.$$

Further equations $L_3 X = R_3, \ldots$ imply further conditions which must be satisfied by $Y_2, \ldots$,
but it is now clear that this process comes to an end after $c$ steps and that the most general
solution of the simultaneous equations is eventually achieved.

10. We conclude by giving two simple illustrations of the foregoing theory in the case $n = 3$.
We shall solve the equations

$$LX = 0 \quad\text{and}\quad LX = R,$$

where

$$L = (23) - (31) - (321) + (123),$$

$$R = \epsilon + (12) - (23) - (123).$$

It is easily verified that $L^2 = 0$, so $\phi(x) = x^2$, and the equations are not in a prepared form.
Take $A = (23)$. Then

$$M = AL = \epsilon - (12) + (31) - (123),$$

$$S = AR = -\epsilon + (23) - (31) + (321).$$

Since we can verify that $M^2 = 3M$, it follows that

$$\phi(x) = x - \tfrac13 x^2, \quad \psi(x) = 1 - \tfrac13 x, \quad \theta(x) = \tfrac13.$$

Now

$$\psi(M) = \epsilon - \tfrac13 M = \tfrac13\{2\epsilon + (12) - (31) + (123)\}.$$

The most general solution of $LX = 0$ is therefore

$$X = \tfrac13\{2\epsilon + (12) - (31) + (123)\}Y,$$

and that of $LX = R$ is

$$X = \tfrac13\{-\epsilon + (23) - (31) + (321)\} + \tfrac13\{2\epsilon + (12) - (31) + (123)\}Y,$$

where in each case $Y$ is an arbitrary expression.

Since the coefficient of $\epsilon$ in $\psi(M)$ is $\tfrac23$, the equations under consideration have $3!\cdot\tfrac23$, i.e.
4 linearly independent solutions. These can be taken to be given by

$$Y = \epsilon, \ (12), \ (23), \ (123)$$

in turn.
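The arithmetic of this illustration can be checked mechanically. The following is a minimal sketch in Python, not part of the original paper: it represents an expression as a dictionary from permutations of {1, 2, 3} to rational coefficients. The helper names (`cycle`, `compose`, `elem`, `add`, `scale`, `mul`, `rank`) are hypothetical, and the product convention (in a product pq the right-hand factor q acts first) is an assumption, chosen because it reproduces the stated values of M = AL and S = AR.

```python
from fractions import Fraction

E = (1, 2, 3)  # identity; entry i-1 holds the image of i

def cycle(*c):
    """Permutation of {1,2,3} given by one cycle, e.g. cycle(1,2,3): 1->2->3->1."""
    img = [1, 2, 3]
    for i, a in enumerate(c):
        img[a - 1] = c[(i + 1) % len(c)]
    return tuple(img)

def compose(p, q):
    """Product pq under the assumed convention: apply q first, then p."""
    return tuple(p[q[i] - 1] for i in range(3))

def elem(*terms):
    """Group-algebra element from (coefficient, permutation) pairs; zero terms dropped."""
    out = {}
    for coef, p in terms:
        out[p] = out.get(p, 0) + Fraction(coef)
    return {p: c for p, c in out.items() if c != 0}

def add(x, y):
    return elem(*[(c, p) for p, c in x.items()], *[(c, p) for p, c in y.items()])

def scale(a, x):
    return elem(*[(Fraction(a) * c, p) for p, c in x.items()])

def mul(x, y):
    return elem(*[(a * b, compose(p, q)) for p, a in x.items() for q, b in y.items()])

def rank(rows):
    """Rank of a list of rational row vectors by exact Gaussian elimination."""
    rows = [list(r) for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(rk + 1, len(rows)):
            f = rows[i][col] / rows[rk][col]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

L = elem((1, cycle(2, 3)), (-1, cycle(3, 1)), (-1, cycle(3, 2, 1)), (1, cycle(1, 2, 3)))
R = elem((1, E), (1, cycle(1, 2)), (-1, cycle(2, 3)), (-1, cycle(1, 2, 3)))
A = elem((1, cycle(2, 3)))
M, S = mul(A, L), mul(A, R)          # prepared form MX = S

third = Fraction(1, 3)
psi = add(elem((1, E)), scale(-third, M))   # psi(M) = e - M/3
X0 = scale(third, S)                        # particular solution theta(M)S = S/3

GROUP = [E, cycle(1, 2), cycle(2, 3), cycle(3, 1), cycle(1, 2, 3), cycle(3, 2, 1)]
coords = lambda x: [x.get(g, Fraction(0)) for g in GROUP]

# Solutions of LX = 0 generated by Y = e, (12), (23), (123) in turn.
chosen = [E, cycle(1, 2), cycle(2, 3), cycle(1, 2, 3)]
n_chosen = rank([coords(mul(psi, elem((1, g)))) for g in chosen])
n_total = rank([coords(mul(psi, elem((1, g)))) for g in GROUP])

print("L^2 = 0:", mul(L, L) == {})
print("M^2 = 3M:", mul(M, M) == scale(3, M))
print("L(S/3) = R:", mul(L, X0) == R)
print("independent solutions:", n_chosen, "of", n_total)
```

Exact rational arithmetic is used so that the identities hold as equalities rather than to within rounding error.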

Summary 

Substitutional equations of the type considered by the late Alfred Young are shown to be
intimately related with the theory of idempotents. Any equation $LX = 0$ possessing solutions
other than $X = 0$ is shown to have the same solutions as another equation $MX = 0$, where $M$
is obtained from $L$ by premultiplying the latter by a suitably chosen expression $A$ such that
the minimum equation of $M$ is $x\psi(x) = 0$, $\psi(x)$ being prime to $x$. The expression $\psi(M)$ is then
idempotent, and it is shown that the most general solution of $LX = 0$ is $X = \psi(M)Y$, where $Y$
is an arbitrary expression. The number of linearly independent solutions of $LX = 0$ is $kn!$,
where $k$ is the coefficient of the unit permutation in $\psi(M)$ when that expression is expressed in
terms of the permutations of the symmetric group $S_n$.

Corresponding results are obtained for the equation $LX = R$, and methods are given for
solving sets of simultaneous equations of both types.





REFERENCES TO LITERATURE 

Littlewood, D. E., 1940. The Theory of Group Characters, Oxford.

MacDuffee, C. C., 1933. The Theory of Matrices, Berlin.

Turnbull, H. W., and Aitken, A. C., 1932. The Theory of Canonical Matrices, London and Glasgow.

Weyl, H., 1939. The Classical Groups, Princeton.

Young, A., 1900. "On Quantitative Substitutional Analysis," Proc. London Math. Soc., XXXIII, 97-146.

Young, A., 1927. "On Quantitative Substitutional Analysis" (third paper), Proc. London Math. Soc., (2), XXVIII, 255-292.


(Issued separately December 14, 1944)





XV. Quantum Mechanics of Fields. III. Electromagnetic Field and Electron
Field in Interaction. By Professor Max Born, F.R.S., and H. W. Peng,
Ph.D., Carnegie Research Fellow, University of Edinburgh.

(MS. received June 29, 1944. Read October 30, 1944) 

Introduction 

Studying the interaction of different pure fields, we have been led to some essential
modifications of the ideas on which our quantum mechanics of fields is based. We shall
explain these here for the example of the interaction of the Maxwell and the Dirac field.

In Part I * we showed that a pure field in a given volume $\Omega$ can be described by considering
the potentials and field components as matrices, not attached to single points in $\Omega$ (as in the theory
of Heisenberg and Pauli), but to the whole volume. Further, we assumed the total energy
and momentum to be the product of $\Omega$ and the corresponding densities. In Part II † we
showed that this conception has to be modified; the eigenvalues of the energy and momentum
as defined in Part I represent neither the states of single particles nor of a system of particles,
but of something intermediate which corresponds to the simple oscillators of Heisenberg-Pauli
and which we have called apeirons. The total energy and momentum of the system
is a sum over the contributions of an assembly of apeirons. Mathematically the difference
of the quantum mechanics of a field from that of a set of mass points (as treated in ordinary
quantum mechanics) is the fact that the matrices representing a field are reducible (while
those representing co-ordinates of mass points are irreducible); each irreducible submatrix
corresponds to an apeiron.

The considerations of Part II make it obvious that the correct theory of quantised fields 
must be a much closer union of mechanics and statistics than we had anticipated in Part I. 

A second indication of the need for more precise definitions and modifications is now 
obtained from the consideration of the nature of the interaction terms in the Lagrangian. 
We have assumed in Part I that the Lagrangian characterising a system is given by the 
usual function of the field quantities taken from the classical theory. So long as one has to 
do with pure fields this leads to no ambiguity. In the classical theories of the photon, meson, 
or electron field the Lagrangian is of the second degree in the field quantities. If one 
introduces a Fourier transformation (transition to the momentum space) the new expressions 
are still of the second degree and essentially the same as the original ones. It is true, we have 
also considered the case of arbitrary functions (non-linear equations); but we cannot ascertain 
whether the general function assumed to be the Lagrangian has as arguments the field 
quantities in the ordinary space or in the Fourier space. Hence the general theory is formally 
correct but has very little concrete content. 

If we now consider the coupling of two pure fields we cannot avoid a decision about the
function by which we represent the interaction. In the case of the (Maxwell plus Dirac) field this
function is of the 3rd degree in the space representation; its Fourier transform, also of the
3rd degree, is a quite different and much more complicated function. Which of them is the
correct one? Once this question has been decided, the function chosen will then be considered
as a matrix function and remains as such unchanged for every canonical (matrix) transformation.
Hence if we have to choose the Fourier transformed function we would maintain
that this rather complicated function is also the correct expression for the quantum theory
of the fields in the space representation; the simpler classical expressions would then be only
approximately true, in the sense of the correspondence principle.

Now we have a guide for this decision in the fact that the [...] apart
from the well-known awkward divergent terms, satisfactorily represents the facts of absorption,
emission, scattering, etc. This theory is based on the Fourier representation (momentum
space). Hence we have to consider this representation as fundamental, and we have to choose
the Lagrangian correspondingly, replacing Fourier coefficients by matrices in the same way
as in the first approach to quantum mechanics (Heisenberg-Born-Jordan).

* These Proceedings, LXII, 1944, 40.

The whole argument which leads to a new and satisfactory theory for interacting fields 
is based on the correspondence postulate, and in order to make this clear it seemed to us 
advisable to go back to first principles. Therefore we begin with the classical form of the 
theory and introduce then the quantisation method of Heisenberg and Pauli. It is now an 
easy step to replace this semi-classical procedure by our new method, which is a generalised 
quantum mechanics using only non-commuting quantities. 

A main feature of this theory is the definition of the total energy and momentum by the 
traces of the matrices representing the densities; this is an obvious generalisation of the apeiron 
sums used in Part II. It is further necessary to generalise the commutation laws for the 
field quantities given in Part I in such a way that they include the commutation of any quantity 
with that which is produced from it by permuting the apeirons; in this manner the full 
correspondence with the Heisenberg-Pauli commutation laws is obtained. 

That the new theory is not less satisfactory than that of Heisenberg and Pauli is obvious 
from its derivation. The difference can be expressed by saying that the new theory admits 
an arbitrary apeiron distribution, while that of Heisenberg and Pauli assumes a uniform 
apeiron distribution (in momentum space). Hence all results not involving this distribution 
will be the same, while the divergent integrals produced by the uniform distribution become 
now convergent and may lead to new results. 

We think, however, that the new theory may have more far-reaching consequences
concerning the connection of the ultimate particles. These difficult problems, which have to be
considered in relation to the most general principles of quantum theory, will be postponed 
for another article. 


1. The Introduction of the Interaction between the Electron Field
and the Electromagnetic Field in General


Without interaction, let the electromagnetic field be described in general by a Lagrangian
$L'(u_{gh})$ of the field strengths

$$u_{gh} = \frac{\partial u_h}{\partial x_g} - \frac{\partial u_g}{\partial x_h}, \tag{1.1}$$

and the electron field be described by a Lagrangian $L''$ of the spinors $v$, $v^*$ and their derivatives
$v_g$, $v_g^*$. The combined field, together with interaction, is then † described by the Lagrangian

$$L = L'(u_{gh}) + L''(v, v^*, v_g, v_g^*), \tag{1.2}$$

where only the definition of $v_g$ and $v_g^*$ is now altered so as to take account of the interaction:

$$v_g = D_g v, \quad v_g^* = D_g^* v^*, \quad D_g = \frac{\partial}{\partial x_g} + \frac{ie}{\hbar c}u_g. \tag{1.3}$$
For abbreviation let

$$U_{gh} = \frac{\partial L}{\partial u_{gh}}, \quad V = \frac{\partial L}{\partial v^*}, \quad V_g = \frac{\partial L}{\partial v_g^*}, \tag{1.4}$$


and similarly for the conjugates. The variational equations with respect to $u_h$, $v^*$ and $v$ are


('-5) 

$$-D_g V_g + V = 0, \quad -D_g^* V_g^* + V^* = 0. \tag{1.6}$$


† [...] a generalisation of the introduction of the interaction with Maxwell's field given [...]




In order that (1.5) may be integrable the charge-current vector

(1.7)

must satisfy the equation of continuity, which is by (1.3) and (1.6)

(1.8)

This is the case if the Lagrangian is invariant under the general gauge transformation

$$v \to v e^{i\gamma}, \quad u_g \to u_g - \frac{\hbar c}{e}\frac{\partial\gamma}{\partial x_g}, \quad\text{and hence}\quad v_g \to v_g e^{i\gamma}, \tag{1.9}$$

where $\gamma$ is an arbitrary real function of the co-ordinates. Then (1.8) is verified by varying
$v$, $v_g$ and their conjugates according to (1.9) and demanding the total variation of $L$ with
respect to $\gamma$ to vanish.

The energy-momentum tensor $T_{gh}$ for the combined field including the interaction can
also be partitioned into two parts in a gauge-invariant way:

$$T_{gh} = T'_{gh} + T''_{gh}, \tag{1.10}$$

$$T'_{gh} = u_{gi}U_{hi} - L'\delta_{gh}, \tag{1.11}$$

t^X)+(M) - l \ a . (1.12) 

The divergence of the part $T'_{gh}$ which may be attributed to the electromagnetic field is, by
using the cyclic divergence equations resulting from (1.1) and in virtue of the source term
of (1.5),

(1.13)

The divergence of the part $T''_{gh}$ which may be attributed to the electrons in the presence of
the electromagnetic field is, by using the identity




and (1.6) and (1.4), 


■g^-<D*V V») + (Vv Dfo) + <*,V*) + (Vv*) 


V? - - V ft -i 


By using (1.8) the $\partial/\partial x_h$ here can be replaced by $D_h$ where it operates on $v$ and $v_g$, and by $D_h^*$
where it operates on $v^*$ and $v_g^*$. By using (1.13) and cancelling terms (1.15) becomes


■ (JhP. ~ V?) + (V ft , DJ»; - D>*). 


By using (1.3) again and noting that, by (1.1),

$$D_g D_h - D_h D_g = \frac{ie}{\hbar c}u_{gh}, \quad D_g^* D_h^* - D_h^* D_g^* = -\frac{ie}{\hbar c}u_{gh}, \tag{1.17}$$

(1.16) becomes, by using also (1.7),


- m = -u^u gh s h . 



This verifies the Lorentz law of force in general. The sum of (1.13) and (1.18) yields the
conservation laws for the energy and momentum

$$\frac{\partial T_{gh}}{\partial x_h} = 0. \tag{1.19}$$


2. Change of Notation for the Passage from the Lagrangian Formalism
to the Hamiltonian Formalism. Maxwell's Field and Dirac's Field

In order to prepare the classical theory of § 1 for quantisation by the method of Heisenberg 
and Pauli or by the new method, it is necessary to pass from the Lagrangian formalism to 
the Hamiltonian formalism by treating the time differently from the spatial co-ordinates. 
It is then appropriate to use the space-vector notation. 

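We specialise to Maxwell's field and Dirac's field in interaction described by the
Lagrangian

$$L = \tfrac14 u_{gh}u_{gh} + \frac{\hbar c}{2i}\{(v_g\alpha_g v^*) - (v\alpha_g v_g^*)\} + mc^2(v\beta v^*), \tag{2.1}$$

where $\alpha_1$, $\alpha_2$, $\alpha_3$ and $\beta$ are Dirac's matrices, but $\alpha_4 = i = \sqrt{-1}$ in consequence of our use of
pseudo-Euclidean metric. In space-vector notation let

$$\begin{aligned}
(x_1, x_2, x_3) &= \mathbf r, & x_4 &= ict,\\
(u_1, u_2, u_3) &= \mathbf A, & u_4 &= i\phi,\\
(u_{23}, u_{31}, u_{12}) &= \mathbf H, & (u_{41}, u_{42}, u_{43}) &= i\mathbf E,\\
(\alpha_1, \alpha_2, \alpha_3) &= \boldsymbol\alpha, & \alpha_4 &= i,\\
(s_1, s_2, s_3) &= \mathbf j, & s_4 &= i\rho,\\
(v_1, v_2, v_3) &= \mathbf f + \frac{ie}{\hbar c}\mathbf A v, & v_4 &= \frac{1}{ic}\Big(\frac{\partial v}{\partial t} - \frac{ie}{\hbar}\phi v\Big).
\end{aligned} \tag{2.2}$$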


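Among the components of the energy-momentum tensor we shall need only $T_{g4}$, which by
integration over a volume $\Omega$ give the total momentum $\mathbf p$ and energy $E$ contained in $\Omega$, as
follows:

$$\int_\Omega (T_{14}, T_{24}, T_{34})\,d\tau = ic\mathbf p, \qquad \int_\Omega T_{44}\,d\tau = -E. \tag{2.3}$$

The equations of § 1 will now be specialised for the Lagrangian (2.1) and written in the
notation (2.2) and (2.3), as far as they will be needed in the following sections. They can
be ordered in four groups:

(a) Equations which are merely definitions of the densities of current and charge, and
the total momentum and energy:

$$\mathbf j = -e(v\boldsymbol\alpha v^*), \qquad \rho = -e(vv^*), \tag{2.4, 2.5}$$

$$\mathbf p = \int_\Omega\Big(\frac1c\,\mathbf E\wedge\mathbf H - \frac1c\,\mathbf A\rho + \frac{\hbar}{2i}\{(\mathbf f v^*) - (v\mathbf f^*)\}\Big)d\tau, \tag{2.6}$$

$$E = \int_\Omega\Big(\frac{\mathbf E^2 + \mathbf H^2}{2} - \mathbf A\cdot\mathbf j + \frac{\hbar c}{2i}\{(\mathbf f\cdot\boldsymbol\alpha v^*) - (v\boldsymbol\alpha\cdot\mathbf f^*)\} + mc^2(v\beta v^*)\Big)d\tau. \tag{2.7}$$

(b) The field equations in space:

$$\operatorname{grad} v = \mathbf f, \qquad \operatorname{curl}\mathbf A = \mathbf H, \qquad \operatorname{div}\mathbf E = \rho. \tag{2.8, 2.9, 2.10}$$

(c) The field equations in time:

$$-\frac1c\frac{\partial\mathbf A}{\partial t} = \mathbf E + \operatorname{grad}\phi, \qquad \frac1c\frac{\partial\mathbf E}{\partial t} = \operatorname{curl}\mathbf H - \mathbf j, \tag{2.11, 2.12}$$

$$-\frac{\hbar}{i}\frac{\partial v}{\partial t} = -ev\phi + \frac{\hbar c}{i}\,\mathbf f\cdot\boldsymbol\alpha + ev\boldsymbol\alpha\cdot\mathbf A + mc^2 v\beta. \tag{2.13}$$

(d) The equation of continuity which is a consequence of (2.4), (2.5), (2.8), and (2.13):

$$\operatorname{div}\mathbf j + \frac1c\frac{\partial\rho}{\partial t} = 0. \tag{2.14}$$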


In addition, in order to avoid the arbitrary gradient which can be added to the vector
potential $\mathbf A$ according to the gauge transformation (1.9), it is practical to adopt the restriction

$$\operatorname{div}\mathbf A = 0 \tag{2.15}$$

for any time $t$. This restriction fixes the electromagnetic potentials and leaves only a constant
phase factor undetermined for the electron field. The scalar potential is then determined
by the density of charge according to Poisson's equation

$$\operatorname{div}\operatorname{grad}\phi = -\rho, \tag{2.16}$$

which follows easily from (2.15), (2.11), and (2.10).


3. Separation of the Electromagnetic Field into a Transversal and a 
Longitudinal Part in the Wave-Vector Representation 

It is a well-known theorem that any vector field can be decomposed into a divergence-free 
part plus a curl-free part. For the electromagnetic field H is, by (2.9), divergence-free and 
A is, by (2.15), chosen to be so. E is not divergence-free, but its divergence is determined, 
by (2.10) and (2.5), by the fundamental variables of the electron field. 

Hence as the fundamental variables of the total field we shall take those of the electron 
field plus those of the divergence-free part of the electromagnetic field, because the curl-free 
part of the electromagnetic field can be expressed in terms of the variables of the electron 
field. This can be done explicitly by resolving all field variables into their Fourier 
coefficients. 

We enclose the field in a rectangular box and expand all the field components into three-dimensional
Fourier series, assuming the usual periodic boundary conditions. For the
electron field the field components are complex quantities. Let †

$$v(\mathbf r) = \sum_l v_l e^{i\mathbf l\cdot\mathbf r}, \qquad v^*(\mathbf r) = \sum_l v_l^* e^{-i\mathbf l\cdot\mathbf r}, \text{ etc.} \tag{3.1}$$

The wave-vector $\mathbf l$ covers the whole reciprocal space. For the electromagnetic field the field
components are real. Therefore † we write

$$\mathbf A(\mathbf r) = \sum_k \big(\mathbf A_k e^{i\mathbf k\cdot\mathbf r} + \mathbf A_k^* e^{-i\mathbf k\cdot\mathbf r}\big), \text{ etc.} \tag{3.2}$$

Here the wave-vector $\mathbf k$ covers only half the reciprocal space in order that all the Fourier
coefficients $\mathbf A_k$ and $\mathbf A_k^*$ may be independent. Then the Fourier coefficients of the
divergence-free part of any vector field are perpendicular to the wave-vectors and will be
called transversal, while those of the curl-free part are parallel to the wave-vectors and will
be called longitudinal.

By inserting the Fourier series for $\mathbf E$ and $\rho$ in (2.10) and equating the coefficients of
$\exp(i\mathbf k\cdot\mathbf r)$ on both sides we get

$$i\mathbf k\cdot\mathbf E_k = \rho_k. \tag{3.3}$$

Thus the longitudinal part of $\mathbf E_k$ is, by projection along the direction parallel to $\mathbf k$,

$$\mathbf E_{k,\mathrm{long.}} = \mathbf k\,\frac{\mathbf k\cdot\mathbf E_k}{k^2} = -i\mathbf k\,\frac{\rho_k}{k^2}. \tag{3.4}$$

The transversal part of $\mathbf E_k$ is therefore

$$\mathbf E_{k,\mathrm{tr.}} = \mathbf E_k - \mathbf E_{k,\mathrm{long.}} = \mathbf E_k + i\mathbf k\rho_k/k^2. \tag{3.5}$$

† Unlike the usual practice, no normalisation factor is introduced in the Fourier analysis. The Fourier
coefficients of a quantity are then of the same physical dimension as that of the quantity itself.

We shall, however, regard this equation as the definition of $\mathbf E_k$ in terms of $\mathbf E_{k,\mathrm{tr.}}$ and take
the latter as a fundamental variable. In fact, as will be shown immediately, the field equations
for the electromagnetic field can be expressed in terms of the variables of the transversal
part alone.
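The projection in (3.3) to (3.5) can be illustrated numerically. The sketch below is an illustration only, not part of the original paper: the wave-vector `k` and the coefficient `E_k` are hypothetical sample values, and `dot` and `split` are helper names introduced here. It splits a complex Fourier coefficient into its longitudinal and transversal parts and compares the longitudinal part with the charge-density form of (3.4).

```python
def dot(a, b):
    """Plain (non-conjugating) scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def split(k, E_k):
    """Return (E_long, E_tr): E_long is parallel to k and k . E_tr = 0."""
    proj = dot(k, E_k) / dot(k, k)
    E_long = [proj * kc for kc in k]
    E_tr = [e - l for e, l in zip(E_k, E_long)]
    return E_long, E_tr

k = (1.0, 2.0, 2.0)                           # hypothetical wave-vector, k^2 = 9
E_k = (0.5 + 1.0j, -1.0 + 0.0j, 2.0 - 0.5j)   # hypothetical Fourier coefficient
rho_k = 1j * dot(k, E_k)                      # (3.3): i k . E_k = rho_k

E_long, E_tr = split(k, E_k)

# (3.4) written in terms of the charge density: E_long = -i k rho_k / k^2
expected_long = [-1j * kc * rho_k / dot(k, k) for kc in k]

print("k . E_tr =", dot(k, E_tr))
print("E_long   =", E_long)
print("expected =", expected_long)
```

The transversal part carries the two independent components per wave-vector that are kept as fundamental variables; the longitudinal part is fixed by the electron field through the charge density.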

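The Fourier coefficients of the scalar potential are determined by virtue of (2.16), by the
electron field

$$\phi_k = \rho_k/k^2. \tag{3.6}$$

By this and (3.5), (2.11) yields simply

$$-\frac1c\frac{\partial\mathbf A_k}{\partial t} = \mathbf E_k + i\mathbf k\phi_k = \mathbf E_k + i\mathbf k\,\frac{\rho_k}{k^2} = \mathbf E_{k,\mathrm{tr.}} \tag{3.7}$$

From (2.12) follows the vector equation

$$\frac1c\frac{\partial\mathbf E_k}{\partial t} = i\mathbf k\wedge\mathbf H_k - \mathbf j_k, \tag{3.8}$$

but the longitudinal part of this is, by (3.3), nothing but the equation of continuity

$$\frac1c\frac{\partial\rho_k}{\partial t} = -i\mathbf k\cdot\mathbf j_k, \tag{3.9}$$

which involves only the variables of the electron field, and is a consequence of the field
equations for the latter. Hence (3.8) is, by (3.9), equivalent to

$$\frac1c\frac{\partial\mathbf E_{k,\mathrm{tr.}}}{\partial t} = i\mathbf k\wedge\mathbf H_k - \mathbf j_k + \mathbf k\,\frac{\mathbf k\cdot\mathbf j_k}{k^2}. \tag{3.10}$$

The other variables, $\mathbf H_k$ and $\mathbf A_k$, are entirely transversal. Either may be taken as a fundamental
variable; the other can then be expressed in terms of it. By (2.9), $\mathbf H_k$ can be expressed
in terms of $\mathbf A_k$ thus:

$$\mathbf H_k = i\mathbf k\wedge\mathbf A_k. \tag{3.11}$$

Conversely, this can be solved for $\mathbf A_k$ (because the latter is transversal) and yields

$$\mathbf A_k = i\mathbf k\wedge\mathbf H_k/k^2. \tag{3.12}$$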
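That (3.12) inverts (3.11) only for a transversal potential can be seen from a small numerical sketch. It is an illustration under assumed sample values, not part of the original paper: `k` and `A_raw` are hypothetical, and `dot` and `cross` are helper names introduced here. The identity used is $\mathbf k\wedge(\mathbf k\wedge\mathbf A) = \mathbf k(\mathbf k\cdot\mathbf A) - k^2\mathbf A$, which reduces to $-k^2\mathbf A$ when $\mathbf k\cdot\mathbf A = 0$.

```python
def dot(a, b):
    """Plain (non-conjugating) scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product a ^ b of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

k = (1.0, 2.0, 2.0)                              # hypothetical wave-vector
k2 = dot(k, k)
A_raw = (0.3 - 0.2j, 1.0 + 0.5j, -0.7 + 0.1j)    # hypothetical, not yet transversal

# Impose the restriction (2.15), i k . A_k = 0, by removing the part along k.
A_k = [a - kc * dot(k, A_raw) / k2 for a, kc in zip(A_raw, k)]

H_k = [1j * c for c in cross(k, A_k)]            # (3.11)
A_back = [1j * c / k2 for c in cross(k, H_k)]    # (3.12)

max_err = max(abs(a - b) for a, b in zip(A_k, A_back))
print("k . H_k =", dot(k, H_k))
print("largest deviation of recovered A_k:", max_err)
```

Applying the same two steps to `A_raw` directly would return only its transversal part, which is why the restriction (2.15) is needed for the inversion.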

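After inserting in (2.6) the Fourier series for the field quantities the integration gives for the
total momentum

$$\mathbf p = \Omega\Big(\sum_k \frac1c\{\mathbf E_k\wedge\mathbf H_k^* + \mathbf E_k^*\wedge\mathbf H_k - \mathbf A_k\rho_k^* - \mathbf A_k^*\rho_k\} + \sum_l \frac{\hbar}{2i}\{(\mathbf f_l v_l^*) - (v_l\mathbf f_l^*)\}\Big). \tag{3.13}$$

By (3.12) and (3.5) this becomes the sum of contributions by the transversal electromagnetic
field and the electron field without any interaction term,

$$\mathbf p = \Omega\Big(\sum_k \frac1c\{\mathbf E_{k,\mathrm{tr.}}\wedge\mathbf H_k^* + \mathbf E_{k,\mathrm{tr.}}^*\wedge\mathbf H_k\} + \sum_l \frac{\hbar}{2i}\{(\mathbf f_l v_l^*) - (v_l\mathbf f_l^*)\}\Big). \tag{3.14}$$

However, the total energy $E$ still contains interaction terms:

$$E = \Omega\Big(\sum_k\{\mathbf E_{k,\mathrm{tr.}}\cdot\mathbf E_{k,\mathrm{tr.}}^* + \mathbf H_k\cdot\mathbf H_k^* + \rho_k\rho_k^*/k^2 - \mathbf A_k\cdot\mathbf j_k^* - \mathbf A_k^*\cdot\mathbf j_k\} + \sum_l \frac{\hbar c}{2i}\{(\mathbf f_l\cdot\boldsymbol\alpha v_l^*) - (v_l\boldsymbol\alpha\cdot\mathbf f_l^*)\} + mc^2\sum_l (v_l\beta v_l^*)\Big). \tag{3.15}$$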

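The equations (2.4), (2.5), (2.8), and (2.13), which are mainly concerned with the electron
field, become, in terms of the Fourier coefficients,

$$\mathbf j_k = -e\sum_l (v_{l+k}\boldsymbol\alpha v_l^*), \qquad \rho_k = -e\sum_l (v_{l+k}v_l^*), \tag{3.16, 3.17}$$

$$\mathbf f_l = i\mathbf l v_l, \tag{3.18}$$

$$-\frac{\hbar}{i}\frac{\partial v_l}{\partial t} = \frac{\hbar c}{i}\,\mathbf f_l\cdot\boldsymbol\alpha + e\sum_k\{v_{l-k}(\boldsymbol\alpha\cdot\mathbf A_k - \phi_k) + v_{l+k}(\boldsymbol\alpha\cdot\mathbf A_k^* - \phi_k^*)\} + mc^2 v_l\beta. \tag{3.19}$$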



The total charge $q$ contained in the volume $\Omega$ is, by integrating (2.5),

$$q = -e\Omega\sum_l (v_l v_l^*). \tag{3.20}$$

4. Quantisation of Maxwell’s Field and Dirac’s Field in Interaction 
by the Method of Heisenberg and Pauli 

(a) Simple Treatment 

A simple and practical way of applying the method of Heisenberg and Pauli for the
quantisation of a field (in our case the combined field of Maxwell and Dirac) is the following:
after the field quantities have been resolved into their Fourier coefficients (as we have done
in § 3), the field is treated as an assembly of oscillators characterised by the wave-vectors.
Then the field equations in time (§ 2 (c)) give rise to the equations of motion of the oscillators,
viz. in our case (3.7), (3.10), (3.19) and their adjoint equations. The canonical variables,
which can be read off from the time-derivative terms of these equations, are the following
conjugate pairs:

$$v_l,\ v_l^*; \qquad \mathbf A_k,\ \mathbf E_{k,\mathrm{tr.}}^*; \qquad \mathbf E_{k,\mathrm{tr.}},\ \mathbf A_k^*.$$

The equations which arise from space-derivatives are, according to this view, to be regarded
partly as definitions of some auxiliary variables (namely, (3.11) as the definition of $\mathbf H_k$ and
(3.18) of $\mathbf f_l$) and partly as constraints among the above canonical variables (namely,

$$i\mathbf k\cdot\mathbf A_k = 0, \qquad i\mathbf k\cdot\mathbf E_k - \rho_k = 0 \tag{4.1}$$

and their adjoint equations). Quantisation consists in considering all the canonically conjugate
pairs of variables as q-numbers which satisfy simple commutation or anti-commutation laws,
as follows.

For the spinor components ($s$ denoting the spinor index) anti-commutation laws hold †

$$[v_{ls}, v_{ls}^*]_+ = v_{ls}v_{ls}^* + v_{ls}^*v_{ls} = 1/\Omega. \tag{4.2}$$

All other anti-commutators vanish.

For the vector components let the axes be chosen so that $\mathbf k$ lies on the $z$-axis. The $x$- and
$y$-components of $\mathbf A_k$ and $\mathbf E_{k,\mathrm{tr.}}$ satisfy the commutation laws

[Afcg, E#^ tr,l 3 tr.-^kx “ tr. = ^HcjCl, [Aftf? trJ 1=5 2 HcjCl , (h x = k v — o). (4-3) 

All other commutators vanish. 

By an arbitrary rotation of the co-ordinate axes (4.3) becomes in general J 

. (4.4) 

All the vector components of $\mathbf A_k$, $\mathbf E_{k,\mathrm{tr.}}$ and their adjoints commute with all the spinor
components of $v_l$ and $v_l^*$.

The quantised equations of motion are 

§- "|[E, F], F=A S , E* tr .; E i>tr , A* ; Vl} »*; (4.5) 

where the total energy $E$ of the assembly of oscillators is given by the q-number expression §
of (3.15). The close formal analogy between the classical and quantum equations of motion
can be demonstrated by working out the commutators of (4.5). If $F$ is a component of $\mathbf A_k$,
in (3.15) only the term $\mathbf E_{k,\mathrm{tr.}}\cdot\mathbf E_{k,\mathrm{tr.}}^*$ does not commute with $\mathbf A_k$. By (4.4) and (4.1) the

† Jordan and Wigner, Zeits. f. Physik, XLVII, 1928, 631.

‡ Novobatzky, loc. cit.

§ Since the order of factors can be arbitrarily changed in a c-number expression but not so in a q-number
expression, the q-number expression of (3.15) (as well as that of (3.14), etc.) is slightly ambiguous. This has
no effect on the anti-commutation laws or commutation laws. The zero-point momentum and charge, however,
can be avoided by taking the mean of the two possible q-number expressions as discussed in Part I.

P.R.S.E.—YOL. LXII, A, 1944-45, PART II 10 


commutator has the value


[E, A J == ( 4 * 6 ) 

Hence (4.5) with $F = \mathbf A_k$ is formally identical with (3.7). If $F = \mathbf E_{k,\mathrm{tr.}}$ the relevant term of
(3.15) can be written, using the adjoint of (3.11), in the form

$$\mathbf H_k\cdot\mathbf H_k^* - \mathbf A_k^*\cdot\mathbf j_k = (i\mathbf k\wedge\mathbf H_k)\cdot\mathbf A_k^* - \mathbf A_k^*\cdot\mathbf j_k. \tag{4.7}$$

With the help of the adjoint of (4.4), (4.5) with $F = \mathbf E_{k,\mathrm{tr.}}$ is seen to be formally identical with
(3.10). Making use of (3.6), (3.16), (3.17), their adjoint equations and (4.2), the
quantised equation of motion (4.5) for $v_l$ is formally identical with (3.19), if the order of the
factors $\phi$ and $v$ in the latter equation is properly adjusted.

Instead of working with the Heisenberg representation (4.5) one can consider the variables
of (4.5) as operators satisfying (4.2) and (4.4) and describe the field (which is here treated as
an assembly of oscillators) by the Schroedinger wave function $\Psi$. $\Psi$ contains, besides the
independent variable $t$, the variables on which the operators of (4.5) act. One then replaces
(4.5) by the wave equation

$$\frac{\hbar}{i}\frac{\partial\Psi}{\partial t} = -E\Psi, \tag{4.8}$$

where the Hamiltonian $E$ is the operator expression of (3.15).


(b) Complete Treatment 

In the complete and rigorous application of Heisenberg and Pauli's method for the
quantisation of a field the equations (2.8), (2.9), (2.10), and (2.15), where only space derivatives
appear, are to be treated on the same footing as the equations containing time-derivatives.
The quantised equations of motion in space supplement those in time (4.5), namely

$$\operatorname{grad} F = \frac{i}{\hbar}[\mathbf p, F]; \qquad F = F(\mathbf r), \text{ the field variables at the point } \mathbf r. \tag{4.9}$$

$\mathbf p$ denotes the total momentum of the field contained in the volume $\Omega$. If the Fourier series
(3.1) and (3.2) for the field variables are used and their Fourier coefficients considered as
q-numbers (the $\mathbf r$ in field theories is a c-number which may be compared to the time $t$ in the
quantum mechanics of particles) (4.9) becomes

$$[\mathbf p, F_l] = \hbar\mathbf l F_l, \ F_l = v_l, \mathbf f_l; \qquad [\mathbf p, F_k] = \hbar\mathbf k F_k, \ F_k = \mathbf A_k, \mathbf E_{k,\mathrm{tr.}}, \mathbf H_k, \mathbf j_k, \rho_k. \tag{4.10}$$

One has now to add to the canonical variables considered in the simple treatment the
quantities $\mathbf f_l$, $\mathbf H_k$ and their adjoints as fundamental field variables. The following additional
commutation and anti-commutation laws containing the additional field variables have to
hold and to be considered as fundamental as those given above, (4.2) and (4.4):

[f V ls\+ — ~~ \£lsj V l$\+ = t V&} [f \l8X}f *= 1% 4/ii ; ( 4 . 11 ) 

•h'fcy, tr.] ~ tr.] == [E^^, H&J — — [Efo^tr.j 555 ftokJO,, (4,12) 

It is understood that the vector indices $x$, $y$, $z$ in (4.12) may be cyclically permuted. With
$\mathbf p$ given by the q-number expression of (3.14), (4.10) follows directly from the totality of the
commutation and anti-commutation laws if one chooses $F_l = v_l$, $\mathbf f_l$ and $F_k = \mathbf A_k$, $\mathbf E_{k,\mathrm{tr.}}$, $\mathbf H_k$.
By using the definition (3.16) and (3.17), and also (4.10) with $F_l = v_l$ together with the adjoint
equation, one sees that (4.10) holds for $F_k = \mathbf j_k$ and $\rho_k$; for instance

$$[\mathbf p, \rho_k] = -e\sum_l\{([\mathbf p, v_{l+k}]v_l^*) + (v_{l+k}[\mathbf p, v_l^*])\} = -e\sum_l\{\hbar(\mathbf l + \mathbf k)(v_{l+k}v_l^*) - (v_{l+k}v_l^*)\hbar\mathbf l\} = \hbar\mathbf k\rho_k. \tag{4.13}$$

By (4.2) and (4.11) the combination $\mathbf f_{ls} - i\mathbf l v_{ls}$ anti-commutes with all the field variables
of the electron field and commutes with those of the electromagnetic field. Hence it vanishes.
By (4.4), (4.12), and (4.10) the combinations $\mathbf H_k - i\mathbf k\wedge\mathbf A_k$ and $i\mathbf k\cdot\mathbf A_k$ commute with all
field variables. Hence these expressions commute with the total momentum and the total
energy of the field; therefore they vanish in virtue of (4.9) and properly chosen boundary
conditions, or in virtue of (4.5) and proper initial conditions.

Thus in the complete treatment (3.18), (3.11), and (4.1) as q-number equations result
from the commutation and anti-commutation laws. In virtue of (4.10) and (4.9) these
equations can be written in a way formally identical with the classical equations (2.8), (2.9),
(2.10), and (2.15).

The demonstration of the formal identity between the classical and quantum equation
of motion in time can now be carried out in the same manner as in the simple treatment.

It follows from (4.10) and its adjoint that the components of the momentum and the
energy all commute with each other:

$$[p_x, p_y] = 0, \text{ etc.}, \qquad [\mathbf p, E] = 0. \tag{4.14}$$

In the Schroedinger representation (4.5) and (4.9) are to be replaced by the equations

$$\frac{\hbar}{i}\frac{\partial\Psi}{\partial t} = -E\Psi, \qquad \frac{\hbar}{i}\operatorname{grad}\Psi = \mathbf p\Psi, \tag{4.15}$$

where $E$ and $\mathbf p$ are the operator expressions of (3.15) and (3.14). Because of (4.14) the wave
equations (4.15) are compatible.

5. New Method of Quantisation 

The Heisenberg-Pauli method developed in the last section can be considered as semi-classical;
while it uses the classical representation of the field variables by Fourier series (3.1)
and (3.2) which are ordinary functions of the position vector $\mathbf r$, it considers the Fourier
coefficients not as functions of time, but as q-numbers. Now we proceed to a method
of complete quantisation in which space and time are treated on the same footing.
We discard the Fourier series as a sum of terms and replace for each field variable the set
of its Fourier coefficients by an array of elements which form a more complicated matrix †
than that needed to represent the individual q-number Fourier coefficients by matrices in the
Heisenberg and Pauli quantisation. The total matrix belongs to the volume $\Omega$ as a whole
and contains, as will soon appear, the totality of information about the field variable it
represents throughout this volume. The only non-vanishing anti-commutation and commutation
laws for the total matrices are those given in Part I, § 5 and § 6 (equations (5.13),
(5.18), and (6.28)); they are the following ($s$ denoting the spinor index):

$$[v_s, v_s^*]_+ = 1/\Omega, \tag{5.1}$$

[fj, ®,]+—- fsxify\+ ~ (S- 2 ) 
[H3, E* te .]= -[H v , E* tr.] = [Ey.tr., - - [E*. Hj] ss HdJQ, ( 5 . 3 ) 

where x } y , z may be cyclically permuted; 

ilkc 

[Ag., Ey } tr.] ” [A *, F y> trj “ {$xy — (fix + &y + &z) ^h x k^. (5*4) 


(5.2) and (5.3) contain also the definitions of the new self-adjoint variables $\mathbf l$ and $\mathbf k$ thus introduced.
These variables, like all the field variables, are represented by total matrices and are
not to be confused with the c-number wave-vectors used in § 3 or § 4 for Fourier analysis.
As shown in Part I, § 6, the three components of $\mathbf l$ commute with all the field variables of
the electron field by virtue of the anti-commutation laws (5.1) and (5.2) only. By (5.2) again
they commute among themselves. In the present case, because the field variables of the
electron field all commute with those of the transversal electromagnetic field, $l_x$, $l_y$, and $l_z$
commute also with all variables of the transversal electromagnetic field. Similarly, as a
consequence of (5.3) and (5.4), $k_x$, $k_y$, and $k_z$ commute with all field variables, among themselves
and also with $l_x$, $l_y$, $l_z$.

† Such a matrix will be referred to in what follows as a "total matrix."

Since all the field variables commute with $k_x$, $k_y$, $k_z$, $l_x$, $l_y$, and $l_z$ the set of matrices which
represent them is reducible. In the representation where $k_x$, $k_y$, $k_z$, $l_x$, $l_y$, and $l_z$ are
simultaneously diagonal the total matrices for all the field variables will appear as being
composed of submatrices placed along the diagonal. The representation is the direct
product of the two representations for the two pure fields considered in Part I, § 5 and § 6,
separately. In this representation we have

rr ^ k’S/fc'j At> W, tc't— JWl'Sn* ? (5*5) 

F kT, rr — ^k'^k'k^iT for F = H, Etr.? j, p and their adjoints; (5*6) 

F&t, k'T — $k'k*Fv$n" for ¥ = v,f and their adjoints; (5*7) 

where the submatrices $k'_x$, $k'_y$, $k'_z$, $l'_x$, $l'_y$, and $l'_z$ are scalar (i.e. a number multiplied by a unit
matrix) and the submatrices $F_{k'}$ or $F_{l'}$ will be shown to be the same as those matrices
used in the Heisenberg and Pauli quantisation to represent the corresponding q-number
Fourier coefficients.

As in the Heisenberg and Pauli quantisation where it is convenient to treat the field as
an assembly of oscillators each of which is described by the Fourier coefficients belonging
to a wave-vector, so in the new method of quantisation it is convenient to treat the field as
an assembly of apeirons each of which is described by the submatrices belonging to an eigenvalue
of the total matrices $\mathbf k$ or $\mathbf l$ introduced by (5.3) and (5.2). Yet in the new method of
quantisation the eigenvalue $\mathbf k'$ or $\mathbf l'$ needs only to assume a selection of the possible values of
the wave-vector, while all of them are automatically included in the Heisenberg and Pauli
quantisation where the wave-vector is introduced by means of Fourier analysis.

The non-vanishing anti-commutation laws and commutation laws obtained by taking the
diagonal submatrices $k'l'$, $k'l'$ of (5.1), (5.2), (5.3), and (5.4) coincide exactly with those of
Heisenberg and Pauli's Fourier coefficients, namely (4.2), (4.10), (4.11), and (4.4), except
that now $\mathbf k'$ and $\mathbf l'$ are written in place of the former $\mathbf k$ and $\mathbf l$. The vanishing anti-commutation
and commutation brackets obtained by taking the non-diagonal submatrices $k'l'$, $k''l''$ (either
$\mathbf k' \neq \mathbf k''$ or $\mathbf l' \neq \mathbf l''$ or both) of (5.1), (5.2), (5.3), and (5.4) are trivial identities because no field
quantity, by virtue of its being reducible, contains any non-vanishing non-diagonal
submatrices.

In order to obtain the full correspondence of the commutation laws and anti-commutation
laws for the submatrices with those for the Fourier coefficients we have to supplement (5.1) by

$$[Pv_sP^*, v_s^*]_+ = 0, \tag{5.8}$$

where $P$ denotes any permutation matrix permuting all the $l'$-apeirons. It is sufficient,
however, to take $P$ to be the set of cyclic permutations. (5.2), (5.3), and (5.4) are to be
supplemented in a similar way.

Let I denote the unit matrix 8 ^ and trace^ A denote the sum S*A W . Then the 
operation I trace^ produces a scalar matrix of the same number of rows and columns as the 
matrix on which it operates. Let J trace< 3) have a similar significance where J denotes the 
unit matrix Corresponding to (3.14) and (3.15), which express the total momentum 
and energy as summations over the Fourier coefficients, we use now the summations over the 
submatrices, namely 

p=Q^I trace (s) {E te aH* +E* aH} +|j trace (l) {(fo*) - (5.9) 

E =a(l tnw» ( „{E fc .E* + H. H* + 9 {k% +*J + ty-'p* - A . j* - A* . j} 

(5-10) 

The density p and current j are now defined by 

3® trace (l )(CtC # a£/*), *~ej trace^C^C*®*), (5.11), (5.12) 


+ trace a) {(f.az> # ) ~(z/a.f*)}-f^ 2 J trace^^/to*)), 




where the matrix C satisfies the following conditions:— 

kC = Ck, Cl-lC-kC, CC* = i. ( 5 . 13 ) 

It follows from (5.13) that Cw % yy vanishes except if k'=k" and l"-r=k' when it is of 
modulus unity. Hence we have 

(CvC*))n* =8 (S- 1 ^ 

and the correspondence of (5.11) and (5.12) with (3.16) and (3.17) is apparent. 

The total charge is 

-e£lj trsLce a) (vv*). (5.15) 

The correspondence of the submatrices of the new method of quantisation with the 
Fourier coefficients of the Heisenberg and Pauli quantisation is so close that the demonstration 
of the field equations given in § 4 for the Fourier coefficients may be now taken over for the 
submatrices with corresponding formal change, which is too obvious to be repeated here. 

In the Schroedinger representation (4.15) holds with E and p now considered as the 
operator expression of (5.10) and (5.9). 

Since the total matrices representing k and l commute with those representing the energy 
E and the momentum p, k and l are constants of motion in time and space. That is, the 
distribution of both the electromagnetic and the electronic apeirons (which is described by 
the values k', k'', etc., and l', l'', etc., that actually occur as submatrices of k and l, cf. Part II) 
remains the same throughout space and time in spite of the interaction between Maxwell’s 
field and Dirac’s field. The interaction affects only the number of quanta occupying the 
apeirons, the quanta being electrons and photons respectively. Hence our attempt made 
in Part II to determine the apeiron distribution by statistical considerations cannot be based 
on the theory in its present form, which is very likely only provisional. Then there may be 
another way, also mentioned in Part II, to determine this distribution, namely by studying 
its effect on the self-energies of the quanta and the transition probabilities of collision processes. 


(Issued separately December 14, 1944) 





XVI.—Studies in Practical Mathematics. IV. On Linear Approximation by 
Least Squares. By A. C. Aitken, D.Sc., F.R.S., Mathematical Institute, 
University of Edinburgh. 

(MS. received December 8, 1944. Read February 5, 1945 ) 

1. Introductory 

R. Frisch, in a paper (Frisch, 1928) on correlation and scatter in statistical variables, made 
an extensive use of matrices, and in particular of the moment matrix, as he called it, of a 
set of variables. The matrices were square arrays, with an equal number of rows and 
columns. This paper of Frisch pointed the way to an even more extensive use of the algebra 
of matrices in problems of statistics. 

What Frisch called the moment matrix may perhaps be more suitably called, nowadays, 
the variance matrix of a set or vector of variates, since the moments in question are all 
variances or covariances. In the present paper, which is illustrative of matrix methods, 
we explore the familiar ground of linear approximation by Least Squares, making full use of 
the properties of the variance matrix. We also study the linear transformations that convert 
crude data into smoothed or graduated values, or into residuals, or into coefficients in a 
linear representation by chosen functions. 

The problems (i) of obtaining by Least Squares the best solution of a set of inconsistent 
linear equations 

$$a_{i1}x_1 + a_{i2}x_2 + \dots + a_{in}x_n = u_i, \qquad i = 1, 2, \dots, m > n, \tag{1}$$

where the $u_i$ are subject to error, and (ii) of representing a set of data $u(x_i)$ by Least Squares 
in the form 

$$y(x_i) = a_0 + a_1 p_1(x_i) + a_2 p_2(x_i) + \dots + a_k p_k(x_i), \qquad i = 1, 2, \dots, n, \quad n > k+1, \tag{2}$$

where the $p_j(x_i)$ are values of $k$ prescribed functions, are in essence the same, though 
differences of notation are apt to conceal this fact and to perplex the beginner. In (1) the 
$a_{ij}$ are known constants, the $u_i$ are the observations affected by error, the $x_j$ are the unknowns. 
In (2) the functional values $p_j(x_i)$ are known constants, being for example polynomial, or 
harmonic, or arbitrary, but in every case belonging to some prescribed basis of $k+1$ linearly 
independent functions $1, p_1(x), p_2(x), \dots, p_k(x)$; the $u(x_i)$ are the observations over the 
set of values of $x$, the coefficients $a_j$ are the unknowns. The values $u(x)$ may be imagined 
as plotted ordinates, the $y(x)$ as corresponding ordinates on the approximating or graduated 
curve. 

In the notation of matrices the observational equations in (1) are $Ax = u$ and in (2) are 
$Pa = u$. Here $A$ is a matrix $[a_{ij}]$, $x$ and $u$ are column vectors, while $P$ is a matrix $[p_j(x_i)]$, 
and $a$ is a column vector of elements $a_j$. We shall first consider $Ax = u$, always observing 
that by a simple change of lettering we have analogues for the case $Pa = u$. 

2. The Variance Matrix and Fundamental Lemmas 

Let $x$ be a column vector having $n$ variates $x_i$ as its elements. Let the mean value of 
each $x_i$ be taken as origin, so that the $x_i$ are deviations from means. Then $xx'$ is a symmetric 
matrix $[x_i x_j]$ of order $n \times n$ and rank 1. Since the mean value of $x_i x_j$ is the product moment 
or covariance $\rho_{ij}\sigma_i\sigma_j$, where $\rho_{ij}$ is the correlation coefficient of $x_i$ and $x_j$, we construct the 
mean value of the matrix $xx'$, namely 

$$E(xx') = [\rho_{ij}\sigma_i\sigma_j] = V \tag{1}$$

and call it the variance matrix $V$ of the $x_i$, or simply of the vector $x$. In general $V$ is positive 
definite, but in certain cases it may be non-negative definite of less than full rank. 




Lemma 1. Let the $x_i$ be linearly transformed to new variates $y_j$ by $y = Hx$, where $H$ is 
in general a rectangular matrix. Then the variance matrix of $y$ is 

$$E(Hxx'H') = HVH'. \tag{2}$$

Thus $V$ is transformed like the matrix of a quadratic form, except that $H$ and $H'$ appear 
in reversed order. This simple Lemma is of great value. 
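
Lemma 1 can be checked numerically. The following is a minimal sketch, assuming NumPy and using a sample analogue of the mean value E; the sizes, the random data and the matrix H are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p, samples = 4, 2, 500            # hypothetical sizes, for illustration only
X = rng.normal(size=(n, samples))    # each column is one observation of the vector x
X -= X.mean(axis=1, keepdims=True)   # take the mean of each variate as origin

V = X @ X.T / samples                # sample analogue of E(xx')
H = rng.normal(size=(p, n))          # rectangular transformation y = Hx

Y = H @ X
V_y = Y @ Y.T / samples              # variance matrix of y, computed directly

lemma1_holds = np.allclose(V_y, H @ V @ H.T)
print(lemma1_holds)
```

For sample moments the identity is exact, since averaging is a linear operation; this is the same reason it holds for the mean value E.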

We make use also of the important 

Lemma 2. Let $B$ be a variable rectangular matrix with fewer rows than columns, let $P$ 
and $A$ be given matrices such that $BA = P$, and let $V$ be a positive definite symmetric matrix 
of such an order that $A'V^{-1}A$ can be constructed. Then the trace (sum of diagonal elements) 
of $BVB'$ is a minimum (or, alternatively, the diagonal elements of $BVB'$ attain independently 
their minimum values) provided that 

$$B = P(A'V^{-1}A)^{-1}A'V^{-1}. \tag{3}$$

This was proved by the author in an earlier paper (Aitken, 1934). 

By Lemma 2 we may treat the inconsistent equations $Ax = u$, postulating that the optimal 
value of each $x_j$ is a consistent linear function of the $u_i$, of minimum variance. Let us at this 
stage generalize the problem by supposing that the $u_i$ are not uncorrelated but have variance 
matrix $V$. We have then to express $x$ in the form $Bu$, and so the variance matrix of $x$ is 
$BVB'$. The diagonal elements of $BVB'$ are thus each to be made a minimum subject to 
the condition of consistency, and this condition, since $Ax = u$ gives $Bu = BAx$, must be that 
$BA = I$. Hence, by Lemma 2, 

$$B = (A'V^{-1}A)^{-1}A'V^{-1}, \tag{4}$$

and so the vector of solutions $x$ is obtained by solving 

$$A'V^{-1}Ax = A'V^{-1}u. \tag{5}$$

These are indeed, in matrix notation, the normal equations derived from correlated 
observations. They also give the vector $x$ which makes the positive definite quadratic form 
$(u - Ax)'V^{-1}(u - Ax)$ a minimum. We shall call this the residual quadratic, since $u - Ax$ 
is the vector of residuals. For observations of unit weight and uncorrelated we have $V = I$, 
and the normal equations take the simple shape $A'Ax = A'u$. For observations uncorrelated 
and of weights $w_i = \sigma_i^{-2}$ the equations are $A'WAx = A'Wu$, where $W$ is a diagonal matrix 
with elements $w_i$ in the diagonal. 
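
The normal equations for correlated observations can be sketched as follows, assuming NumPy. The matrices A and V and the vector u are random and hypothetical. The comparison with ordinary least squares on "whitened" data (via a Cholesky factor of V) is a standard cross-check, not a step taken in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 3                                   # hypothetical sizes
A = rng.normal(size=(m, n))
u = rng.normal(size=m)

L = rng.normal(size=(m, m))
V = L @ L.T + m * np.eye(m)                   # a positive definite variance matrix

Vinv_A = np.linalg.solve(V, A)                # V^{-1} A
B = np.linalg.solve(A.T @ Vinv_A, Vinv_A.T)   # B = (A'V^{-1}A)^{-1} A'V^{-1}
x = B @ u                                     # solution of the normal equations
var_x = B @ V @ B.T                           # variance matrix of x, by Lemma 1


def residual_quadratic(z):
    """Return (u - Az)' V^{-1} (u - Az)."""
    r = u - A @ z
    return r @ np.linalg.solve(V, r)


# Cross-check: with V = KK', ordinary least squares on K^{-1}A and K^{-1}u
# minimizes the same residual quadratic.
K = np.linalg.cholesky(V)
x_whitened, *_ = np.linalg.lstsq(np.linalg.solve(K, A),
                                 np.linalg.solve(K, u), rcond=None)
```

Here `B @ A` is the unit matrix (the condition of consistency), and `var_x` agrees with the reciprocal of the matrix of the normal equations.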

The Variance Matrix of the Solutions. By Lemma 1 the variance matrix of the 
solutions $x$ is 

$$(A'V^{-1}A)^{-1}A'V^{-1} \cdot V \cdot V^{-1}A(A'V^{-1}A)^{-1} = (A'V^{-1}A)^{-1}. \tag{6}$$

This result, that the variance matrix of the solutions is the reciprocal of the matrix of the 
normal equations, contains the classic result of Gauss on the weights of the solutions, a result 
that in modern notation would be expressed thus: the weights of the solutions (in the 
uncorrelated case) are the reciprocals of the diagonal elements of $(A'WA)^{-1}$. 

The Variance Matrix of the Residuals. The vector of residuals is 

$$u - Ax = \{I - A(A'V^{-1}A)^{-1}A'V^{-1}\}u, \tag{7}$$

and so by Lemma 1 the variance matrix of the residuals is 

$$\{I - A(A'V^{-1}A)^{-1}A'V^{-1}\}\,V\,\{I - A(A'V^{-1}A)^{-1}A'V^{-1}\}' = V - A(A'V^{-1}A)^{-1}A'. \tag{8}$$

For uncorrelated observations all of unit weight it is 

$$I - A(A'A)^{-1}A'. \tag{9}$$

In this latter case we may easily find the mean value of the sum of the squared residuals. 
For if the variance matrix just given is $T$, we note at once that $T^2 = T$, so that $T$ is idempotent 
and satisfies the reduced characteristic or minimal equation $\lambda^2 - \lambda = 0$. Its latent roots are 
thus exclusively 1 and 0. But $A(A'A)^{-1}A'$ is also idempotent, and of rank $n$. Hence its 
trace, being equal to the sum of its latent roots, is $n$, and so the trace of $T$ is $m - n$. It 
follows that if each $u$ is of variance $\sigma^2$, then the mean value of the sum of squared residuals 
is $(m - n)\sigma^2$. If $\sigma^2$ is to be estimated from the sum of squared residuals, then $m - n$ is the 
divisor to use. This is a classical result, first proved by Gauss. 
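
The properties of T used in this argument can be illustrated numerically. This is a sketch assuming NumPy; the matrix A, the value of σ and the number of simulated trials are hypothetical. The simulation at the end estimates the mean sum of squared residuals only approximately.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 8, 3                                   # hypothetical sizes
A = rng.normal(size=(m, n))

T = np.eye(m) - A @ np.linalg.solve(A.T @ A, A.T)   # I - A(A'A)^{-1}A'
latent_roots = np.linalg.eigvalsh(T)                # all close to 0 or 1
trace_T = np.trace(T)                               # close to m - n

# Monte Carlo illustration: the residual vector is T e for errors e,
# whatever the true x, so the sum of squared residuals is e'Te.
sigma, trials = 2.0, 20000
E = sigma * rng.normal(size=(trials, m))
rss = ((E @ T) ** 2).sum(axis=1)
mean_rss = rss.mean()                               # near (m - n) * sigma**2
```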

The Residual Quadratic. The residual quadratic itself, namely $(u - Ax)'V^{-1}(u - Ax)$, 
can be expressed in a variety of ways, for by referring to 

$$A'V^{-1}Ax = A'V^{-1}u$$

we may transform it to 

$$u'V^{-1}(u - Ax), \quad \text{or} \quad u'V^{-1}u - x'A'V^{-1}Ax, \tag{10}$$

or 

$$u'V^{-1}u - u'V^{-1}A(A'V^{-1}A)^{-1}A'V^{-1}u. \tag{11}$$

The uncorrelated cases of these are obtained by putting $V = I$, and are well known, though 
differently expressed, in the classical literature of Least Squares. The last form given above 
can be written as a quotient of two symmetric determinants and can be expanded in a series 
of some interest. We take $V = I$. Since a quadratic form involving a reciprocal matrix 
can be written as the quotient of a bordered determinant by the cofactor of its leading element, 
we have here 


$$u'u - u'A(A'A)^{-1}A'u = \begin{vmatrix} u'u & u'a & u'b & \cdots & u'h \\ a'u & a'a & a'b & \cdots & a'h \\ b'u & b'a & b'b & \cdots & b'h \\ \vdots & \vdots & \vdots & & \vdots \\ h'u & h'a & h'b & \cdots & h'h \end{vmatrix} \div \begin{vmatrix} a'a & a'b & \cdots & a'h \\ b'a & b'b & \cdots & b'h \\ \vdots & \vdots & & \vdots \\ h'a & h'b & \cdots & h'h \end{vmatrix}, \tag{12}$$

where $a, b, \dots, h$ denote the successive columns of $A$. (In the notation of Gauss the 
elements of the determinants would appear as $[uu]$, $[au]$, $[bu]$, $[aa]$, $[ab]$ and the like.) Such 
a quotient of determinants may be expanded (cf. Aitken, 1942) as a Schweinsian series, 
giving the sum of squared residuals as 







$$u'u - \frac{(u'a)^2}{a'a} - \frac{\begin{vmatrix} u'a & u'b \\ a'a & a'b \end{vmatrix}^2}{a'a \begin{vmatrix} a'a & a'b \\ b'a & b'b \end{vmatrix}} - \frac{\begin{vmatrix} u'a & u'b & u'c \\ a'a & a'b & a'c \\ b'a & b'b & b'c \end{vmatrix}^2}{\begin{vmatrix} a'a & a'b \\ b'a & b'b \end{vmatrix} \begin{vmatrix} a'a & a'b & a'c \\ b'a & b'b & b'c \\ c'a & c'b & c'c \end{vmatrix}} - \dots, \tag{13}$$

the series terminating after $n+1$ terms. It will appear in the sequel that the terms after the 
first, with negative sign, represent decrements in the sum of squared residuals, produced 
pari passu with a set of orthogonal vector increments by which the vector of residuals is 
built up. But this will be more in place in the discussion of $Pa = u$. 

The residual quadratic in the correlated case can be expressed in the same way as a 
quotient of determinants and can be expanded as a Schweinsian series, differing from (12) 
merely in that $u'u$ becomes $u'V^{-1}u$, and similarly for other elements. But the increments 
of the vector of residuals are then not orthogonal, and the matter becomes somewhat 
academic and remote from practice. 
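
The Schweinsian expansion for the uncorrelated case can be sketched in a few lines, assuming NumPy. The function name `schweinsian_terms` and the data A and u are hypothetical. Each term after the first is formed as minus the square of a bordered minor, divided by the product of two consecutive leading minors of A'A.

```python
import numpy as np


def schweinsian_terms(A, u):
    """Terms of the series for u'u - u'A(A'A)^{-1}A'u, taking V = I."""
    n = A.shape[1]
    G = A.T @ A                      # elements a'a, a'b, ...
    c = A.T @ u                      # elements u'a, u'b, ...
    terms = [u @ u]
    for j in range(1, n + 1):
        # Bordered minor: first row u'a, u'b, ...; then the first j-1 rows of A'A.
        N = np.vstack([c[:j], G[: j - 1, :j]])
        prev_minor = np.linalg.det(G[: j - 1, : j - 1]) if j > 1 else 1.0
        terms.append(-np.linalg.det(N) ** 2 / (prev_minor * np.linalg.det(G[:j, :j])))
    return np.array(terms)


# Hypothetical data: five observations, three unknowns.
A = np.array([[1, 2, 1], [2, 1, 2], [0, 1, 2], [0, 2, 1], [1, 1, 1]], dtype=float)
u = np.array([13, 16, 13, 13, 10], dtype=float)

terms = schweinsian_terms(A, u)
x = np.linalg.solve(A.T @ A, A.T @ u)
sum_sq_residuals = np.sum((u - A @ x) ** 2)
```

The series has n + 1 terms, and its partial sums give the sum of squared residuals when only the leading columns of A are used.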


3. The Curve-fitting Problem 

We turn now to the observational equations in what may be termed the curve-fitting 
problem, $Pa = u$. We are interested here, certainly in the coefficients $a_j$ of the representation, 
but equally in the graduated values $y = Pa$. In this case the vector $a$ is given by the normal 
equations 

$$P'V^{-1}Pa = P'V^{-1}u, \tag{1}$$

and so 

$$y = P(P'V^{-1}P)^{-1}P'V^{-1}u = Gu, \tag{2}$$

where $G$ may be called the graduating matrix. Though the fitting of correlated data has 
been studied (Aitken, 1933) it seldom arises in practice, and is difficult. Accordingly we 
confine ourselves to the cases $V = I$ and $V = W^{-1}$. When $V = I$, the commonly occurring 
case of equally weighted data, the graduating matrix is 

$$G = P(P'P)^{-1}P', \tag{3}$$

a matrix obviously symmetric, idempotent and of rank $k+1$. The sum of squared residuals, 
by § 2 (10), is $u'u - a'P'Pa$. If the data $u$ are spaced at equal intervals in $x$, as is often the 
case, they can be graduated in reversed order, and so, since $G$ is independent of $u$, we see 
that $G$ is unaltered when its rows and columns are reversed. Thus $G$ is in this case symmetric 
about both its diagonals. Again, from its rank and idempotency, its trace is $k+1$. We 
have thus various useful checks upon its evaluation. 

For a polynomial basis, or for a basis of harmonic functions of the usual kind, these 
matrices $G$ are not difficult to construct for moderate values of $n$ and $k$. Especially is this 
the case when the basis of functions is orthogonal and normal. Let the functions $1, p_1(x), 
p_2(x), \dots, p_k(x)$ be transformed into a set of functions $q_j(x)$ orthonormal over the values of $x$, 
by linear combination 

$$q_j(x) = c_{0j} + c_{1j}p_1(x) + c_{2j}p_2(x) + \dots + c_{jj}p_j(x), \qquad j = 0, 1, \dots, k. \tag{4}$$

In other words, $Q = PC$, where $C$ is a matrix of triangular shape, all elements above the 
diagonal being zero. From the orthonormality of the $q_j(x)$ we then have $Q'Q = I$. Also 
$QQ' = G$, since 

$$P(P'P)^{-1}P' = PC(C'P'PC)^{-1}C'P' = Q(Q'Q)^{-1}Q' = QQ'. \tag{5}$$

If the coefficients in such an orthonormal representation are $\alpha_j$ we have the sum of squared 
residuals given as $u'u - \alpha'\alpha$, a standard result. 

As an example, consider the construction of a graduating matrix $G$ for $n = 6$ and a 
polynomial basis with $k = 3$, in fact for fitting cubic polynomial values to 6 equally spaced 
data. We take the four columns of values, for $k = 0, 1, 2, 3$, from some table of orthogonal 
polynomials (e.g. Fisher and Yates, 1943), normalize them if necessary, and thus we have 
$Q$, and so $QQ'$. The actual matrices in this case are 



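$$Q = \begin{bmatrix} 1 & -5 & 5 & -5 \\ 1 & -3 & -1 & 7 \\ 1 & -1 & -4 & 4 \\ 1 & 1 & -4 & -4 \\ 1 & 3 & -1 & -7 \\ 1 & 5 & 5 & 5 \end{bmatrix} \begin{bmatrix} 1/\sqrt{6} & & & \\ & 1/\sqrt{70} & & \\ & & 1/\sqrt{84} & \\ & & & 1/\sqrt{180} \end{bmatrix}, \tag{6}$$

$$QQ' = \frac{1}{126} \begin{bmatrix} 121 & 16 & -14 & -4 & 11 & -4 \\ 16 & 73 & 52 & 2 & -28 & 11 \\ -14 & 52 & 58 & 32 & 2 & -4 \\ -4 & 2 & 32 & 58 & 52 & -14 \\ 11 & -28 & 2 & 52 & 73 & 16 \\ -4 & 11 & -4 & -14 & 16 & 121 \end{bmatrix}. \tag{7}$$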
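
The checks on a graduating matrix described in this section can be run directly for the cubic example with six equally spaced data. This is a sketch assuming NumPy; it builds G from the powers 1, x, x², x³ rather than from a table of orthogonal polynomials, and obtains an orthonormal basis by QR factorization, a choice made here for convenience only. The integer matrix compared against is G multiplied by 126.

```python
import numpy as np

x = np.arange(6, dtype=float)                  # six equally spaced abscissae
P = np.vander(x, 4, increasing=True)           # columns 1, x, x^2, x^3
G = P @ np.linalg.solve(P.T @ P, P.T)          # G = P(P'P)^{-1}P'

Q, _ = np.linalg.qr(P)                         # an orthonormal basis of the same space

G_times_126 = np.array([
    [121,  16, -14,  -4,  11,  -4],
    [ 16,  73,  52,   2, -28,  11],
    [-14,  52,  58,  32,   2,  -4],
    [ -4,   2,  32,  58,  52, -14],
    [ 11, -28,   2,  52,  73,  16],
    [ -4,  11,  -4, -14,  16, 121],
], dtype=float)

checks = {
    "symmetric": np.allclose(G, G.T),
    "idempotent": np.allclose(G @ G, G),
    "trace is k+1": np.isclose(np.trace(G), 4.0),
    "unaltered on reversal": np.allclose(G, G[::-1, ::-1]),
    "equals QQ'": np.allclose(G, Q @ Q.T),
    "agrees with tabulated values": np.allclose(126 * G, G_times_126),
}
print(checks)
```

Any orthonormal basis of the same column space gives the same product QQ', which is why the QR factor can stand in for the tabulated orthogonal polynomials.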


Or again, for fitting a harmonic function 

$$y = a_0 + a_1 \cos x + a_2 \cos 2x + b_1 \sin x + b_2 \sin 2x \tag{8}$$

to 12 data $u_x$, equally spaced over one complete oscillation of the angular variable, we take 
for the columns of $Q$ the values of $1, \cos x, \sin x, \cos 2x, \sin 2x$ at the phases 0°, 30°, 60°, ..., 
330°, normalize these columns and so construct 

$$QQ' = \tfrac{1}{12}\,\mathrm{circ}\,[\,5 \quad 2+\sqrt{3} \quad 1 \quad -1 \quad -1 \quad 2-\sqrt{3} \quad 1 \quad 2-\sqrt{3} \quad -1 \quad -1 \quad 1 \quad 2+\sqrt{3}\,], \tag{9}$$




where circ [. . .] is used to denote a symmetric circulant matrix, completely determined by 
its first row, successive later rows being written down by cyclically permuting the elements 
of the first row. This circulant property could have been deduced at once from the 
consideration that a cycle of arbitrary periodic data can be graduated in any one of its cyclic 
orders. 
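
The circulant form of the harmonic graduating matrix for 12 data can be verified numerically. This is a sketch assuming NumPy; the first row used below follows from the basis 1, cos x, sin x, cos 2x, sin 2x at the twelve phases, with the factor 1/12 coming from the normalization of the columns.

```python
import numpy as np

n = 12
x = 2 * np.pi * np.arange(n) / n               # phases 0, 30, ..., 330 degrees
P = np.column_stack([np.ones(n), np.cos(x), np.sin(x), np.cos(2 * x), np.sin(2 * x)])
Q = P / np.linalg.norm(P, axis=0)              # normalize each column
G = Q @ Q.T

s3 = np.sqrt(3.0)
first_row = np.array([5, 2 + s3, 1, -1, -1, 2 - s3, 1, 2 - s3, -1, -1, 1, 2 + s3]) / 12
# Row i of a circulant is the first row cyclically shifted by i places.
circulant = np.array([np.roll(first_row, i) for i in range(n)])

is_circulant = np.allclose(G, circulant)
print(is_circulant)
```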

These graduating matrices $G$, evaluable once and for all, and applicable to any proposed 
set of $n$ data, are of considerable practical use, quite apart from their interesting theoretical 
properties. Tables of $G$ for polynomials up to $n = 15$, $k = 5$ form the Appendix of a thesis 
by F. Mary Harding (Edinburgh, 1934); and tables of $G$ for the harmonic fitting of $2n$ data, 
$n = 2, 3, 4, 5, 6, 8, 12$ and $k = 1, 2, \dots, n$ occur in the Appendix of a thesis by A. F. Buchan 
(Edinburgh, 1939). 

In the case of unequal weights, $V = W^{-1}$, we may graduate similarly by weighted 
orthogonal functions such that $Q'WQ = I$, the graduating matrix being then $QQ'W$. For example, 
if the weights followed a binomial distribution, the polynomials to use in polynomial fitting 
would be the orthogonal polynomials of Gram. 
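
The weighted case can be sketched as follows, assuming NumPy. The binomial weights, the number of data and the degree are hypothetical choices. The weighted orthonormal basis is obtained here from a Cholesky factor of P'WP, which is one convenient construction and not necessarily the one used with tabulated polynomials.

```python
import numpy as np
from math import comb

n, k = 7, 2                                    # hypothetical: 7 data, quadratic fit
x = np.arange(n, dtype=float)
w = np.array([comb(n - 1, i) for i in range(n)], dtype=float) / 2 ** (n - 1)
W = np.diag(w)                                 # binomial weights

P = np.vander(x, k + 1, increasing=True)       # columns 1, x, x^2
Lc = np.linalg.cholesky(P.T @ W @ P)           # P'WP = Lc Lc'
Q = np.linalg.solve(Lc, P.T).T                 # Q = P (Lc')^{-1}, so that Q'WQ = I

G = Q @ Q.T @ W                                # weighted graduating matrix
G_direct = P @ np.linalg.solve(P.T @ W @ P, P.T @ W)
```

The matrix G is no longer symmetric when the weights are unequal, but it remains idempotent with trace k + 1.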


4. The Coefficients of Terms in the Representation 
The coefficients $a_j$ in the representation 

$$y = a_0 + a_1 p_1(x) + \dots + a_k p_k(x) \tag{1}$$

are found by the transformation $a = (P'P)^{-1}P'u$, but here again it is of the greatest advantage 
to choose if possible the orthonormal representation 

$$y = \alpha_0 q_0(x) + \alpha_1 q_1(x) + \dots + \alpha_k q_k(x), \tag{2}$$

for then $\alpha = (Q'Q)^{-1}Q'u = Q'u$, a familiar and central result in orthogonal representation. 
If a good calculating machine is available we can use $Q'$ directly upon the data $u$. For example 
to find the Fourier coefficients for 6 data $u_x$ over one oscillation, we have 



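$$Q' = \operatorname{diag}\!\left(\tfrac{1}{\sqrt{6}}, \tfrac{1}{\sqrt{12}}, \tfrac{1}{\sqrt{12}}, \tfrac{1}{\sqrt{12}}, \tfrac{1}{\sqrt{12}}, \tfrac{1}{\sqrt{6}}\right) \begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \\ 2 & 1 & -1 & -2 & -1 & 1 \\ \cdot & \sqrt{3} & \sqrt{3} & \cdot & -\sqrt{3} & -\sqrt{3} \\ 2 & -1 & -1 & 2 & -1 & -1 \\ \cdot & \sqrt{3} & -\sqrt{3} & \cdot & \sqrt{3} & -\sqrt{3} \\ 1 & -1 & 1 & -1 & 1 & -1 \end{bmatrix}, \tag{3}$$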


which is merely a matrix of values of cos kx and sin kx suitably normalized. It is our 
experience that the use of such matrices with the machine is much more rapid than any of 
the computing schemes of harmonic analysis. 
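
The use of Q' directly upon the data can be sketched as follows, assuming NumPy. The six data values are hypothetical. The basis is 1, cos x, sin x, cos 2x, sin 2x, cos 3x at the phases 0°, 60°, ..., 300°, each column normalized to unit length.

```python
import numpy as np

n = 6
x = 2 * np.pi * np.arange(n) / n               # phases 0, 60, ..., 300 degrees
P = np.column_stack([np.ones(n), np.cos(x), np.sin(x),
                     np.cos(2 * x), np.sin(2 * x), np.cos(3 * x)])
Q = P / np.linalg.norm(P, axis=0)              # orthonormal columns

u = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0])   # hypothetical data over one oscillation
alpha = Q.T @ u                                # coefficients of the orthonormal representation
u_back = Q @ alpha                             # six functions reproduce six data exactly
```

Because the basis here has as many functions as there are data, the representation reproduces the data exactly, and the sum of squares of the coefficients equals u'u.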


5. The Orthogonal Components of the Graduated Vector 

The graduated vector $y = P(P'P)^{-1}P'u$ can be expressed as the sum of orthogonal 
component vectors. Each of these is at the same time an increment of the graduated vector and 
a decrement of the vector of residuals. 

Let $P_j$ denote the matrix comprised by the first $j+1$ columns of $P$. Then 

$$y_j = P_j(P_j'P_j)^{-1}P_j'u, \tag{1}$$

where $y_j$ is the vector of values graduated in accordance with the partial or truncated 
representation 

$$y_j(x) = a_0 + a_1 p_1(x) + \dots + a_j p_j(x). \tag{2}$$

Let $j$ take the values $0, 1, 2, \dots, k$ in succession. Each vector $y_j$ may then be regarded as 
derived from its predecessor by the incremental vector 

$$y_j - y_{j-1} = \{P_j(P_j'P_j)^{-1}P_j' - P_{j-1}(P_{j-1}'P_{j-1})^{-1}P_{j-1}'\}u. \tag{3}$$

It is easy to see that these incremental vectors are always orthogonal to each other, that is, 
that their scalar product is 0. For suppose the basis transformed as in § 3 to an orthogonal 
basis with matrix $Q$, then in a manner similar to § 3 (5), with $C$ replaced by its leading 
submatrix of order $j \times j$, we have 

$$P_j(P_j'P_j)^{-1}P_j' = Q_jQ_j', \qquad j = 0, 1, \dots, k. \tag{4}$$

Thus the incremental vectors are $(Q_jQ_j' - Q_{j-1}Q_{j-1}')u$; but these produce the individual 
terms $\alpha_j q_j(x)$ of the orthonormal representation; since it is well known, and almost intuitive, 
that to graduate $u(x)$ by means of a basis obtained from a previous orthogonal basis by 
annexing a further orthogonal function $q_j(x)$ is merely to add a term $\alpha_j q_j(x)$ to the existing 
representation. The orthogonality of the incremental vectors is thus established. 

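These orthogonalities lead to many interesting properties of graduating matrices. For 
example, since 

$$\{P_i(P_i'P_i)^{-1}P_i' - P_{i-1}(P_{i-1}'P_{i-1})^{-1}P_{i-1}'\}u \quad \text{and} \quad \{P_j(P_j'P_j)^{-1}P_j' - P_{j-1}(P_{j-1}'P_{j-1})^{-1}P_{j-1}'\}u \tag{5}$$

are orthogonal for $i < j$, and $u$ is arbitrary, we deduce that 

$$\begin{aligned} &\{P_i(P_i'P_i)^{-1}P_i' - P_{i-1}(P_{i-1}'P_{i-1})^{-1}P_{i-1}'\}\{P_j(P_j'P_j)^{-1}P_j' - P_{j-1}(P_{j-1}'P_{j-1})^{-1}P_{j-1}'\} \\ &\quad = \{P_j(P_j'P_j)^{-1}P_j' - P_{j-1}(P_{j-1}'P_{j-1})^{-1}P_{j-1}'\}\{P_i(P_i'P_i)^{-1}P_i' - P_{i-1}(P_{i-1}'P_{i-1})^{-1}P_{i-1}'\} \\ &\quad = 0. \end{aligned} \tag{6}$$

In particular we deduce 

$$A_i(A_i'A_i)^{-1}A_i' \cdot A_j(A_j'A_j)^{-1}A_j' = A_j(A_j'A_j)^{-1}A_j' \cdot A_i(A_i'A_i)^{-1}A_i' = A_i(A_i'A_i)^{-1}A_i', \qquad i < j, \tag{7}$$

where we have written $A$ instead of $P$ to emphasize the fact that the result is a general theorem 
of matrices, $A$ being an arbitrary matrix having linearly independent columns, and $A_j$ being 
the sub-matrix composed of the first $j+1$ of those columns. 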

Translated into the language of fitting by Least Squares such an identity as the above 
has a very simple interpretation. To take an example, it means that if we graduate a set 
of data by fitting values of a quintic polynomial to them, and then, treating the graduated 
values as new data, graduate them in turn by fitting a cubic polynomial, we obtain exactly 
the same final values as if we had graduated the original data directly by a cubic polynomial. 
Such results are evident at once if set out in terms of an orthogonal basis. They have their 
use, however, in the numerical verification of elements of the graduating matrices $G$. Thus 
the graduating matrices $G_2$ and $G_3$ for fitting harmonics up to the second and third respectively 
to 8 data are the symmetric circulant matrices 

$$G_2 = \tfrac{1}{8}\,\mathrm{circ}\,[\,5 \quad 1+\sqrt{2} \quad -1 \quad 1-\sqrt{2} \quad 1 \quad 1-\sqrt{2} \quad -1 \quad 1+\sqrt{2}\,] \tag{8}$$

and 

$$G_3 = \tfrac{1}{8}\,\mathrm{circ}\,[\,7 \quad 1 \quad -1 \quad 1 \quad -1 \quad 1 \quad -1 \quad 1\,]. \tag{9}$$

It is readily verified here that $G_2G_3 = G_3G_2 = G_2$. 
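
Both the harmonic identity and the quintic-then-cubic interpretation can be checked numerically. This is a sketch assuming NumPy; the helper names `graduating_matrix` and `harmonic_basis`, the eleven abscissae and the random data are hypothetical.

```python
import numpy as np


def graduating_matrix(P):
    """Return P(P'P)^{-1}P' for a matrix P with linearly independent columns."""
    return P @ np.linalg.solve(P.T @ P, P.T)


def harmonic_basis(n, k):
    """Columns 1, cos x, sin x, ..., cos kx, sin kx at n equally spaced phases."""
    x = 2 * np.pi * np.arange(n) / n
    cols = [np.ones(n)]
    for h in range(1, k + 1):
        cols += [np.cos(h * x), np.sin(h * x)]
    return np.column_stack(cols)


G2 = graduating_matrix(harmonic_basis(8, 2))
G3 = graduating_matrix(harmonic_basis(8, 3))

# Polynomial illustration: a quintic graduation followed by a cubic one.
rng = np.random.default_rng(3)
x = np.linspace(-1.0, 1.0, 11)
P5 = np.vander(x, 6, increasing=True)          # columns 1, x, ..., x^5
G_quintic = graduating_matrix(P5)
G_cubic = graduating_matrix(P5[:, :4])         # first four columns: 1, x, x^2, x^3
u = rng.normal(size=x.size)                    # hypothetical data
two_stage = G_cubic @ (G_quintic @ u)
direct = G_cubic @ u
```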

We return now to the Schweinsian series of § 2 for the sum of squared residuals. A 
Schweinsian series (cf. Aitken, 1942) for the quotient of two determinants, the numerator 
being obtained by bordering the denominator, arises thus. Form the following sequence: 
first, 0; second, the leading element in the numerator; third, the leading minor of order 2 
in the numerator divided by the leading element in the denominator; fourth, the leading 
minor of order 3 in the numerator divided by the leading minor of order 2 in the denominator; 
and so on. The first differences of this sequence may be proved (loc. cit.) to be equal to the 
respective terms of the Schweinsian series. Now in our present problem the leading 
submatrices of $A'A$ or of $P'P$, as the case may be, are $A_0'A_0, A_1'A_1, \dots$ or $P_0'P_0, P_1'P_1, \dots$. 
These are the matrices of successive normal equations in $x$ or in $a$, of higher and higher order. 
Thus we see that the decrements in the sum of squared residuals, as given by successive 
terms in the Schweinsian series, correspond precisely to the orthogonal component vectors 
by which the graduated vector is augmented as $j$ increases step by step to $k$. It follows from 
the invariance of $P(P'P)^{-1}P'$ that every term of such a Schweinsian expansion is invariant 
under a transformation $Q = PC$ of the basis, where $C$ is a matrix of triangular type with all 
elements above the diagonal equal to zero; and this, whether the new basis $Q$ is orthogonal 
or not. In fact the $(j+2)$th term of the expansion is equal to $-\alpha_j^2$, where $\alpha_j$ is the $(j+1)$th 
coefficient in the orthonormal representation. 

These and many other properties of Least Square representation were first established 
by Tchebychef (Tchebychef, 1858). From the present point of view they arise as immediate 
consequences of the relations $P(P'P)^{-1}P' = QQ'$, $P_j(P_j'P_j)^{-1}P_j' = Q_jQ_j'$, where $Q$ is 
orthogonal. 


6. Least Squares under Exact Linear Conditions 

We turn now to a different topic, the solution of a linear problem in Least Squares subject 
to extraneous restrictions, which we shall assume to be linear. For example, the estimates 
of the three angles of a measured plane triangle must have the sum 180°. In the curve-fitting 
problem, also, it might conceivably be the case that the coefficients $a_j$ of the representation 
had to satisfy some linear condition or conditions. 

The customary procedure (cf. Brunt, 1931, p. 127) is direct and summary: it is, to invoke 
a principle of conditioned Least Squares, and to determine the conditioned minimum by 
introducing as many Lagrange multipliers as may be required. But it is of some interest 
to arrive at this procedure from a different approach, always from the standpoint of consistent 
linear representation and minimum variance. 

The restrictions themselves are, of their very nature, the results of prior experiment and 
measurement. For example, the fact that the three angles of a plane triangle sum to 180°, 
when regarded from the point of view of mensuration and not from that of deductive 
geometry, is the conclusion arrived at, and then not finally, from an immense number of 
measurements of such triangles. The linear restrictions are to be viewed, therefore, as 
equivalent to so many more observational equations, of large (and ultimately indefinitely 
large) weights $w_s$. In the sequel we shall make $w_s \to \infty$, and so for convenience we may regard 
all the $w_s$ as equal to $w$; further, by “preparing” the linear conditions in the usual way by 
multiplying by $\sqrt{w}$ we can treat them as all of weight 1. 

Thus in the non-correlated case the solution consists in minimizing the quadratic form 

$$(u - Ax)'(u - Ax) + w(k - Cx)'(k - Cx), \tag{1}$$

where $Cx = k$ are the $s$ linear restrictions. Let us suppose that the solutions of these normal 
equations lead to $Cx = k + \epsilon$. Then the solving vector $x$ is of course the same as would be 
obtained by minimizing $(u - Ax)'(u - Ax)$, subject to $Cx = k + \epsilon$. If now $w \to \infty$, then $\epsilon \to 0$, 
and the desired values $x$ are obtained by minimizing $(u - Ax)'(u - Ax)$ subject to $Cx = k$. 
Here we make contact with the usual direct approach; a vector $\lambda$ of Lagrange multipliers 
can be introduced, and by minimizing 

$$\tfrac{1}{2}(u - Ax)'(u - Ax) - (Cx - k)'\lambda \tag{2}$$

we derive the normal equations 

$$A'Ax = A'u + C'\lambda, \qquad Cx = k, \tag{3}$$

these being $n+s$ equations in the $n$ unknowns $x$ and $s$ unknowns $\lambda$. 

The above may be called the $\lambda$-method. Actually the method of very large weights, 
which may be called the $w$-method, is of some practical value. Suppose, for example, that 
we are satisfied with an accuracy of 1 in 1000 in our residuals. We might then take 
$w = 10{,}000$, and introducing the linear conditions as observations having this weight we could 
solve for $n$ unknowns, not $n+s$, in the ordinary way. We choose the weight just so large 
that the error induced by not proceeding to the limit, $w \to \infty$, may be negligible in the number 
of significant digits retained. A practical disadvantage is perhaps that the matrix of the 
normal equations is sensitive to small relative changes in its elements. 

By regarding exact conditions as observations of indefinitely great weight we obtain at 
once the mean value of the sum of squared residuals as $(m + s - n)\sigma^2$, since there are $m+s$ 
observations in $n$ unknowns. This is of course well known from another point of view, 
namely that from $s$ linearly independent equations we can express $s$ of the unknowns in terms 
of the remaining $n - s$ and so, entering the results in each observational equation, we have $m$ 
observational equations in $n - s$ unknowns, whence the mean value of the sum of squared 
residuals is $\{m - (n - s)\}\sigma^2$. 

It is of interest at this point to work out a very simple example. It is one in which the 
restrictive condition makes a conspicuous difference in the solutions. 

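Example. Observational equations: 

$$\begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & 2 \\ \cdot & 1 & 2 \\ \cdot & 2 & 1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 13 \\ 16 \\ 13 \\ 13 \\ 10 \end{bmatrix}.$$

Exact linear condition: 

$$2x_1 + 3x_2 + x_3 = 20.$$

Normal equations: 

$$\begin{bmatrix} 6+4w & 5+6w & 6+2w \\ 5+6w & 11+9w & 9+3w \\ 6+2w & 9+3w & 11+w \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 55+40w \\ 91+60w \\ 94+20w \end{bmatrix}.$$

Solutions by reciprocal matrix: 

$$\begin{aligned} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} &= \frac{1}{109+231w} \begin{bmatrix} 40+56w & -1-35w & -21-7w \\ -1-35w & 30+26w & -24-8w \\ -21-7w & -24-8w & 41+38w \end{bmatrix} \begin{bmatrix} 55+40w \\ 91+60w \\ 94+20w \end{bmatrix} \\ &= \frac{1}{109+231w} \begin{bmatrix} 135+357w \\ 419+969w \\ 515+999w \end{bmatrix}. \end{aligned} \tag{6}$$

Thus the solutions with and without the linear condition are respectively 

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \frac{1}{231} \begin{bmatrix} 357 \\ 969 \\ 999 \end{bmatrix} = \begin{bmatrix} 1.545 \\ 4.195 \\ 4.325 \end{bmatrix}, \qquad \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \frac{1}{109} \begin{bmatrix} 135 \\ 419 \\ 515 \end{bmatrix} = \begin{bmatrix} 1.239 \\ 3.844 \\ 4.725 \end{bmatrix}.$$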

Now let us try the moderate value $w = 100$. The normal equations are 

$$\begin{bmatrix} 406 & 605 & 206 \\ 605 & 911 & 309 \\ 206 & 309 & 111 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 4055 \\ 6091 \\ 2094 \end{bmatrix}.$$

The solutions by the reciprocal matrix are 

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \frac{1}{23209} \begin{bmatrix} 5640 & -3501 & -721 \\ -3501 & 2630 & -824 \\ -721 & -824 & 3841 \end{bmatrix} \begin{bmatrix} 4055 \\ 6091 \\ 2094 \end{bmatrix} = \frac{1}{23209} \begin{bmatrix} 35835 \\ 97319 \\ 100415 \end{bmatrix} = \begin{bmatrix} 1.544 \\ 4.193 \\ 4.327 \end{bmatrix}.$$

These differ only in the third place of decimals from the accurate solutions, and they are 
such that $2x_1 + 3x_2 + x_3 = 19.994$. Indeed they are stated with excessive accuracy, for the 
size of the alteration produced in the solutions by imposing the linear condition indicates 
relatively large observational error; but the example is purely illustrative. 
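
The figures of this example can be reproduced with a short computation. This is a sketch assuming NumPy. The coefficient matrix A and the vector u below are an assumption: they are chosen so that A'A and A'u agree with the normal equations printed in the example. The function names `w_method` and `lambda_method` are hypothetical.

```python
import numpy as np

# Assumed observational equations, consistent with the printed normal equations.
A = np.array([[1, 2, 1], [2, 1, 2], [0, 1, 2], [0, 2, 1], [1, 1, 1]], dtype=float)
u = np.array([13, 16, 13, 13, 10], dtype=float)
C = np.array([[2.0, 3.0, 1.0]])                # exact condition 2x1 + 3x2 + x3 = 20
k = np.array([20.0])


def w_method(w):
    """Treat the condition as an observation of weight w."""
    return np.linalg.solve(A.T @ A + w * C.T @ C, A.T @ u + w * C.T @ k)


def lambda_method():
    """Solve the n+s equations A'Ax = A'u + C'lambda, Cx = k."""
    n, s = A.shape[1], C.shape[0]
    M = np.block([[A.T @ A, -C.T], [C, np.zeros((s, s))]])
    sol = np.linalg.solve(M, np.concatenate([A.T @ u, k]))
    return sol[:n], sol[n:]


x_free = w_method(0.0)                         # without the linear condition
x_exact, lam = lambda_method()                 # with the exact condition
x_100 = w_method(100.0)                        # moderate weight w = 100
condition_value = (C @ x_100)[0]               # close to, but not exactly, 20
```

Increasing w brings `w_method(w)` as close to `x_exact` as desired, at the cost of a matrix whose dominant part is of rank s only.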

It may be noted, in (5) and (6) above, that in the determinant $|A'A + wC'C|$, in the 
elements of the adjugate, $\operatorname{adj}(A'A + wC'C)$, and in $\{\operatorname{adj}(A'A + wC'C)\}(A'u + wC'k)$ no 
higher powers of $w$ than the first have occurred. In the general case of $s$ restrictive conditions 
the highest power of $w$ that appears in this determinant and matrices is $w^s$. This is a simple 
consequence of the fact that $C'C$ and $C'$ are of rank $s$, so that in the expansion of any minor 
of the augmented matrix $[A'A + wC'C : A'u + wC'k]$ in powers of $w$ no higher power than 
$w^s$ can occur. The practical bearing of this, as we have mentioned above, is that the dominant 
part of the matrix is of less than full rank, and so the matrices in question are sensitive to 
small relative changes in their elements, and more significant digits have to be retained in 
the arithmetical workings. 


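7. The Relation between the λ-Method and the w-Method 

It is of interest to set out the relationship of the $\lambda$-method to the $w$-method. The $\lambda$-method 
leads to the normal equations, in partitioned matrix form, 

$$\begin{bmatrix} A'A & C' \\ C & \cdot \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} A'u \\ k \end{bmatrix}, \tag{1}$$

while the $w$-method leads to the normal equations 

$$(A'A + wC'C)x = A'u + wC'k. \tag{2}$$

Now equations (1) may be regarded as the limit, when $w \to \infty$, of 

$$\begin{bmatrix} A'A & C' \\ C & -w^{-1}I \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} A'u \\ k \end{bmatrix}. \tag{3}$$

Premultiplying both sides by 

$$\begin{bmatrix} I & wC' \\ \cdot & I \end{bmatrix}$$

we obtain 

$$\begin{bmatrix} A'A + wC'C & \cdot \\ C & -w^{-1}I \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix} = \begin{bmatrix} A'u + wC'k \\ k \end{bmatrix}, \tag{4}$$

that is to say, 

$$(A'A + wC'C)x = A'u + wC'k$$

together with 

$$Cx = k + w^{-1}\lambda. \tag{5}$$

In a word, the normal equations for $x$ as given by the $\lambda$-method under the restrictions 
$Cx = k + \epsilon$ are the normal equations of the $w$-method, provided that $\epsilon = w^{-1}\lambda$. When $w \to \infty$ 
the restricting conditions tend to $Cx = k$ and the two methods tend to equivalence. 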
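
The relation between the two methods for finite w can be illustrated numerically. This is a sketch assuming NumPy; the data A, u, C and k are hypothetical, and the sign of the multiplier follows the partitioned form used in this section.

```python
import numpy as np

# Hypothetical data: five observations, three unknowns, one linear condition.
A = np.array([[1, 2, 1], [2, 1, 2], [0, 1, 2], [0, 2, 1], [1, 1, 1]], dtype=float)
u = np.array([13, 16, 13, 13, 10], dtype=float)
C = np.array([[2.0, 3.0, 1.0]])
k = np.array([20.0])
n, s = A.shape[1], C.shape[0]
w = 100.0

# Partitioned equations with -I/w in the lower right corner.
M = np.block([[A.T @ A, C.T], [C, -np.eye(s) / w]])
sol = np.linalg.solve(M, np.concatenate([A.T @ u, k]))
x_partitioned, lam = sol[:n], sol[n:]

# Normal equations of the w-method.
x_w = np.linalg.solve(A.T @ A + w * C.T @ C, A.T @ u + w * C.T @ k)

eps = C @ x_partitioned - k                    # amount by which Cx departs from k
```

The two solutions for x coincide, and `eps` equals `lam / w`, which tends to zero as w increases.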


REFERENCES TO LITERATURE 

Aitken, A. C., 1933. “On Fitting Polynomials to Data with Weighted and Correlated Errors,” 
Proc. Roy. Soc. Edin., LIV, 12-16. 

Aitken, A. C., 1934. “On Least Squares and Linear Combination of Observations,” Proc. Roy. Soc. Edin., 
LV, 42-48. 

Aitken, A. C., 1942. Determinants and Matrices, Oliver & Boyd, Edinburgh, 2nd ed., pp. 107-109. 

Brunt, D., 1931. The Combination of Observations, Cambridge, 2nd ed., pp. 75-128. 

Buchan, A. F., 1939. Linear Combination of Data with Least Error, Thesis for Ph.D., University 
of Edinburgh. 

Fisher, R. A., and Yates, F., 1943. Statistical Tables for Biological, Agricultural and Medical 
Research. 

Frisch, R., 1928. “Correlation and Scatter in Statistical Variables,” Nordisk Statistisk Tidskrift. 

Harding, F. Mary, 1934. Least Square Smoothing by Linear Combination, Thesis for Ph.D., University 
of Edinburgh. 

Tchebychef, P. L., 1858. Journ. de Math., 2nd ser., III, 289-320. 


(Issued separately October 24, 1945) 





XVII.— The Regraduation of Clocks in Spherically Symmetric Space-times 
of General Relativity. By G. C. McVittie, Ph.D. 

(MS. received October 7, 1944. Revised MS. received February 2, 1945. Read May 7, 1945) 


1. Introduction 

In an earlier paper in these Proceedings * I was able to show that the principle of the 
regraduation of clocks, combined with the definitions of coordinates and of velocity and 
acceleration used in kinematical relativity, had far-reaching consequences in that theory. 
In particular, if it were postulated that the “fundamental” observers had a linear velocity- 
function, it followed that the non-fundamental or “free” particles obeyed Milne’s acceleration 
formula. This investigation dealt only with motion “in the line of sight” of an observer, 
all orientations around him being regarded as equivalent. 

In this note I have attempted to investigate the question of clock regraduation in general 
relativity under the same limitation (spherical symmetry around an observer) as before. It 
was of interest to discover whether clock regraduation would have as important consequences 
in general as in kinematical relativity, and the investigation was prompted by a remark † of
Dr A. G. Walker’s which reads: “in general relativity the time which is identified
with that measured by the physicist is the interval $\int ds$. Whatever is done to the coordinates,
this can on no account be altered, for otherwise the metric value would be changed and the 
associated Riemannian space deformed. In all work on general relativity which I have seen, 
there occurs not one example of a material system being described by more than one 
Riemannian space and yet this is necessary if clocks were to be regraduated in the sense 
of Milne.” 

If therefore I have understood Dr Walker aright, clock regraduation should change the 
metric of space-time intrinsically and this should be universally true, i.e. if any observer
(defined according to the postulates of general relativity) in any space-time were to regraduate
his clock which measures $\int ds$, the space-time would be altered to one of different curvature.
A preliminary examination of the question—which is all that is attempted here—may therefore 
be carried out by discussing a special class of space-times. The method which we shall use 
is, we believe, of more general application and could be extended to other types of space-time. 

Clock regraduation is defined as follows:— 

An observer has a clock which reads time $T_0$. He alters the graduations on the clock-face
in any manner so that the clock reads time $T_1$ where

$$T_0 = F(T_1), \qquad (1)$$

and F is a function expressing the arbitrary change in the marks on the clock-face.

If, however, the observer is of a scientific turn of mind he need not alter the graduations
on the face of the $T_0$ clock, but secure a second clock (the $T_1$ clock) with the same graduations
as the first, this clock working at a different rate relative to the original one. The function
F then expresses the relationship between the rates of working of the two clocks. In this
way the observer could refer his description of events at will to either clock and discover
whether any significant change was produced by so doing.

2. The Postulates of General Relativity 

We require to know exactly what postulates are the basis of general relativity and the 
following are a sufficient set:— 

(a) The equations of mechanics, and of mathematical physics generally, are expressible 
in tensor form. This provides a rule by which an observer may rewrite his equations when 
he changes from one coordinate-system to another. 

* Proc. Roy. Soc. Edin., LXI, A, 210, 1942. † Observatory, LXIV, 18, 1941.





(b) Events are represented by points in a 4-dimensional space-time with metric

$$ds^2 = g_{ij}\,dx^i dx^j \quad (i, j = 1, 2, 3, 4), \qquad (2)$$

one coordinate being time-like and the other three space-like in one at least of the coordinate-systems
employed by the observer, and the $g_{ij}$ being functions of the $(x^i)$. The interval $ds$,
between two events $(x^i)$ and $(x^i + dx^i)$, is not integrable unless a curve joining the two events
is also specified.

(c) Material particles (and observers) trace out the non-null geodesics of space-time. 
The integrated interval along the geodesic is defined as the time kept by a clock travelling 
with the observer and is called his proper-time. 

(d) The paths of light-rays are the null-geodesics of space-time, i.e. those for which $ds = 0$.

(e) The dynamical properties of the material content of space-time are reflected in the
geometrical properties of space-time, the precise relationship being defined thus: The
energy-tensor $(T_{ij})$ of the material content is related to the curvature-tensor $(G_{ij})$ of
space-time and its invariant curvature $(G)$ by Einstein’s gravitational equations

$$-\kappa T_{ij} = G_{ij} - \tfrac12 (G - 2\Lambda)\, g_{ij}, \qquad (3)$$

where $\kappa$ is proportional to the constant of gravitation and $\Lambda$ is the cosmical constant, these
being regarded as physical constants independent of the particular space-time used.

A word of explanation regarding these postulates is perhaps necessary. There is firstly
the question of the determination of the $g_{ij}$. It is usual to say that they are given by equations
(3), but a little consideration will show that we have here a vicious circle. For an observer
cannot measure the values of the components $T_{ij}$ until he has set up his coordinate-system
and knows the geometry of space-time, i.e. unless he knows the $g_{ij}$ as explicit functions of
the coordinates. But he cannot know the $g_{ij}$ until he has introduced specific values for the
$T_{ij}$ in (3) and solved these differential equations. In practice this difficulty is overcome by
assigning a priori values to the $T_{ij}$ which are deemed on general grounds to correspond to
the physical situation under contemplation. More exactly, postulate (e) is applied in two
stages. In the first stage it is used in a “qualitative” way to secure agreement between the
general qualitative properties of the material distribution and the geometrical properties of
space-time. In the second stage it is employed in a “quantitative” manner by giving
explicit values to the $T_{ij}$ and solving (3). Examples may make these points clearer. In
finding the gravitational field of the Sun the first stage is represented by the choice of a
static and spherically symmetric metric for space-time which determines all the $g_{ij}$ except two.
These two are, however, restricted to being functions of the radial coordinate alone. The
second stage is represented by the assigning of zero values to the $T_{ij}$ and to $\Lambda$ and the consequent
solution of equations (3) for the surviving $g_{ij}$. In the Expanding Universe theory, the
application of postulate (e) does not go beyond the first stage. General considerations of
homogeneity, uniform motion of matter in all directions, etc., determine the metric to within
one arbitrary function of the time and one arbitrary constant. Equations (3) are then merely
used to calculate the corresponding components of the energy-tensor in a space-time regarded
as known.
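
The two stages in the example of the Sun may be written out as follows. This is only a sketch in the familiar Schwarzschild form; the functions $\nu(r)$, $\lambda(r)$ and the constant of integration $r_g$ are symbols introduced here for illustration and do not occur in the text.

```latex
% Stage 1 (qualitative): a static, spherically symmetric metric.
% Only two coefficients survive, both functions of r alone.
ds^2 = e^{\nu(r)}\,dt^2 - e^{\lambda(r)}\,dr^2
       - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2 .

% Stage 2 (quantitative): T_{ij} = 0 and \Lambda = 0 in equations (3)
% determine the two surviving coefficients.
e^{\nu(r)} = e^{-\lambda(r)} = 1 - \frac{r_g}{r} .
```

In the Expanding Universe theory only the analogue of the first display is used; the energy-tensor is then read off from (3) rather than prescribed.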


Two points also require consideration regarding the geodesic postulate (c). In the first
place, we have identified observers with freely moving (in the sense of general relativity)
material particles. This we do because we assume that our observers, like terrestrial ones,
have material bodies and carry material instruments with them and that they are not subject
to constraints outside the scope of general relativity (e.g. to the electromagnetic forces of
“unified” field theories). Though the amount of matter represented by the observer and
his instruments may be small compared with that of the system he is contemplating, it cannot
be taken as strictly zero. In the second place, we regard the postulate as independent of the
rest, as an expression of the principle of equivalence.* It is true that Eddington † has sought
to deduce it from equations (3) by using a special form of $T_{ij}$ and proceeding to the limit.


* For example, A. Einstein, Ann. de Physik, XLIX, 1916 (French translation in Les Fondements de la . . . et Cie, Paris, 1933); or R. C. Tolman . . .
† Mathematical Theory of Relativity, § 56, 2nd ed., Cambridge, 1924.




But this investigation, in our opinion, shows no more than that (3) is consistent with the 
postulate. And indeed Eddington’s concluding remark that his proof “ does not add very 
much to the argument of § 17” (where he discusses the principle of equivalence) lends colour 
to this interpretation of his work. Before leaving the subject of postulate (c), we remind the 
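reader that the differential equations of the geodesics are (in terms of an arbitrary parameter $\mu$)

$$\frac{d^2x^i}{d\mu^2} + \begin{Bmatrix} i \\ kj \end{Bmatrix}\frac{dx^k}{d\mu}\frac{dx^j}{d\mu} - \frac{dx^i}{d\mu}\,\frac{d^2s}{d\mu^2}\Big/\frac{ds}{d\mu} = 0, \qquad (4)$$

and that they reduce to

$$\frac{d^2x^i}{ds^2} + \begin{Bmatrix} i \\ kj \end{Bmatrix}\frac{dx^k}{ds}\frac{dx^j}{ds} = 0 \qquad (5)$$

when the arc-length s along the geodesic is used as parameter.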
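
The passage between the two forms of the geodesic equations is an application of the chain rule. A short sketch, writing $\mu$ for the arbitrary parameter and assuming only that $ds/d\mu \neq 0$:

```latex
\frac{dx^i}{ds} = \frac{dx^i/d\mu}{ds/d\mu},
\qquad
\frac{d^2x^i}{ds^2}
  = \frac{1}{(ds/d\mu)^2}
    \left\{ \frac{d^2x^i}{d\mu^2}
          - \frac{dx^i}{d\mu}\,\frac{d^2s/d\mu^2}{ds/d\mu} \right\}.
```

Substituting these expressions into the arc-length form and multiplying through by $(ds/d\mu)^2$ gives the form valid for an arbitrary parameter; the extra term vanishes exactly when $s$ is a linear function of $\mu$.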

Finally, it is mathematically obvious that these postulates could be replaced by others
not logically reducible to them. Examples of theories in which most of the postulates are
abandoned are provided by the many “unified” field theories of gravitation and electromagnetism,
by kinematical relativity or by G. D. Birkhoff’s * “relativity in flat space-time.”
Less generally, Levinson and Zeisler † have investigated the possibility of altering equations (3)
alone, for the case when the $T_{ij}$ and $\Lambda$ are all zero. None of these theories can, however,
claim to be general relativity, which is based on the simultaneous use of all five postulates, or
on others equivalent to them. Recognition of this fact does not preclude us from analysing
the consequences of each postulate in turn. We shall therefore discuss proper-time regraduation
firstly using postulates (a) to (d) and postulate (e) in its qualitative aspect, and then
showing the effect of introducing the quantitative aspect of (e).

3. Ordinary Regraduation 

Before proceeding with the general problem we point out that the regraduation of clocks 
has, in one sense, always been a feature of general relativity. This is a consequence of the 
principle that an observer may use a time-coordinate other than his proper-time for describing 
the events which he sees around him. For example, let the observer have determined the 
metric of space-time ‡ to be

$$ds^2 = g(t, r)\,dt^2 - h(t, r)\,dr^2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2,$$

where g, h now stand for known functions of t and r. For definiteness, suppose that the
observer’s geodesic § is $r = 0$, $\theta = 0$, $\phi = 0$. Then the relation between his proper-time s and
coordinate-time t is

$$s = \int \sqrt{g(t, 0)}\,dt,$$

which is precisely of the form (1), so that the observer may be said to regraduate his proper-time
clock when he describes events along his geodesic in terms of the same time-variable
as he uses for all other events. Alternatively, it follows that the observer carries more than
one clock with him, one of which reads proper-time and the rest read one or other of the
coordinate-times.


* G. D. Birkhoff, Proc. Nat. Acad. Sci., XXIX, 231, 1943, and XXX, 324, 1944.
† The Law of Gravitation in Relativity, Chicago Univ. Press, 1929.
‡ Purely as a matter of convenience r is measured in the same units as t. To convert to units of length, r must be multiplied by the velocity of light, c.
§ By evaluating the equations (5) for this space-time, the condition $(\partial g/\partial r)_{r=0} = 0$ must be imposed on g.


4. Regraduation of Proper-Time

We now turn to the discussion of the effect of applying formula (1) to proper-time directly
and not, as in the last paragraph, through the intermediary of an already established coordinate-time.
We shall consider only the particular class of space-times * with metric


$$ds^2 = dt^2 - \frac{1}{n^2(t)}\{m^2(r)\,dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2\}, \qquad (6)$$

and we shall assume that this form has been arrived at by our observer through the use of
postulates (a) to (d) and postulate (e) in its qualitative aspect only. We also assume that our
observer’s geodesic is

$$s = t, \quad r = 0, \quad \theta = 0, \quad \phi = 0. \qquad (7)$$

These assumptions are equivalent to saying that the observer has decided that the material
distribution he is observing (i) possesses spherical symmetry round himself so that for a given
value of a time-parameter t, the distribution changes only with a distance-parameter r; and
(ii) that in terms of t, the distribution remains similar to itself, but on a different scale, at
successive instants t. In this way he has concluded that the metric of space-time contains
two functions m, n of r, t respectively, which enter into the coefficients of the metric in the
way exhibited in formula (6) but which are otherwise unspecified.

Regraduation of clocks in kinematical relativity plays an important role because the time 
and radial coordinates used in that theory are defined by means of clock-readings on the 
observer’s clock. We can introduce coordinates of this type in general relativity also, which 
we call light-signal coordinates and which we shall use in preference to r and t. These 
coordinates are defined as follows:— 

Let the observer whose geodesic is (7) be called the observer A and let there be another
observer, B, who is described by A as having the coordinates † t, r. Let B have a mirror
which he uses to flash back to A light-signals received from A. We write

$$N(x) = \int n(x)\,dx, \qquad M(x) = \int m(x)\,dx. \qquad (8)$$

Then a light-signal incoming at B which left A at A’s proper-time $s_1$ has the equation

$$N(t) = N(s_1) + M(r) - M(0), \qquad (9)$$

and the outgoing signal from B which reaches A at A’s proper-time $s_2$ is

$$N(t) = N(s_2) - M(r) + M(0). \qquad (10)$$

We define $(s_1, s_2)$ as the light-signal coordinates ‡ of B: they are linked to his t, r coordinates
by the equations §

$$t = N^{-1}[\tfrac12\{N(s_2) + N(s_1)\}], \qquad (11)$$

$$r = M^{-1}[\tfrac12\{N(s_2) - N(s_1)\} + M(0)]. \qquad (12)$$
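
The link between light-signal coordinates and $t$, $r$ can be tried out numerically. The sketch below is illustrative only: it assumes the particular choices $n(t) = T_0/t$ and $m(r) = 1$, so that $N(x) = T_0 \log x$ and $M(x) = x$, and the function names are invented here.

```python
import math

T0 = 1.0  # assumed constant in the illustrative choice n(t) = T0 / t


def N(x):
    """N(x) = integral of n(x) dx for n(t) = T0 / t (constant of integration dropped)."""
    return T0 * math.log(x)


def N_inv(y):
    """Inverse function of N (not a reciprocal)."""
    return math.exp(y / T0)


def M(x):
    """M(x) = integral of m(x) dx for the assumed m(r) = 1."""
    return x


def M_inv(y):
    """Inverse function of M."""
    return y


def light_signal_to_tr(s1, s2):
    """Equations (11) and (12): light-signal coordinates (s1, s2) -> (t, r)."""
    t = N_inv(0.5 * (N(s2) + N(s1)))
    r = M_inv(0.5 * (N(s2) - N(s1)) + M(0.0))
    return t, r


def tr_to_light_signal(t, r):
    """Equations (9) and (10) solved for the emission and reception times."""
    s1 = N_inv(N(t) - M(r) + M(0.0))
    s2 = N_inv(N(t) + M(r) - M(0.0))
    return s1, s2


# A signal sent at s1 = 2 and received back at s2 = 8 (arbitrary units).
t, r = light_signal_to_tr(2.0, 8.0)
print(t, r)
print(tr_to_light_signal(t, r))
```

With these assumed functions $t$ is the geometric mean of $s_1$ and $s_2$, and the round trip through (9) and (10) returns the original pair.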


We also require the expression of the metric (6) in terms of light-signal coordinates which 
is obtained by performing the coordinate transformation (11), (12). It is 


$$ds^2 = \frac{n(s_1)\,n(s_2)\,ds_1\,ds_2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2}{n^2(t)}, \qquad (13)$$

where, for brevity, t and r stand respectively for the right-hand sides of equations (11) and (12).
Hence the statement that the metric contains the unspecified functions m(r) and n(t) is now
replaced by the statement that the metric contains the unspecified functions n and r of $s_1$, $s_2$
which enter into the coefficients in the way exhibited in formula (13).
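
The step from (6) to (13) may be sketched as follows, differentiating (11) and (12) with $N' = n$ and $M' = m$:

```latex
n(t)\,dt = \tfrac12\{\,n(s_2)\,ds_2 + n(s_1)\,ds_1\,\},
\qquad
m(r)\,dr = \tfrac12\{\,n(s_2)\,ds_2 - n(s_1)\,ds_1\,\},

% so that the (t, r) part of the metric (6) becomes
dt^2 - \frac{m^2(r)}{n^2(t)}\,dr^2
  = \frac{n(s_1)\,n(s_2)\,ds_1\,ds_2}{n^2(t)} .
```

The angular terms of (6) are carried over unchanged, which gives (13).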

Suppose now that A regraduates his proper-time clock to read S instead of s where 


$$s = F(S). \qquad (14)$$


* The “expanding universes” of general relativity are the special cases of (6) in which $m^2(r) = 1/(1 - kr^2/R^2)$, $k = +1, 0, -1$.
† The angular coordinates $\theta$, $\phi$ of B do not enter into this discussion.
‡ In kinematical relativity, coordinates $t_m$, $r_m$ are used which are the linear combinations $t_m = \tfrac12(s_2 + s_1)$, $r_m = \tfrac12(s_2 - s_1)$ of $s_1$, $s_2$.
§ We use the index $-1$ throughout to denote an inverse function, not a negative power.

Transforming the equation of A’s geodesic to the parameter S, it becomes 

$$F(S) = t, \quad r = 0, \quad \theta = 0, \quad \phi = 0.$$

It is, of course, open to A to regraduate his t-clock as well as his proper-time clock. If he
applies the same regraduation to both his clocks, i.e. if he regraduates t to $\tau$, where

$$t = F(\tau), \qquad (15)$$

then the equation of his geodesic becomes

$$S = \tau, \quad r = 0, \quad \theta = 0, \quad \phi = 0. \qquad (16)$$

Thus a simultaneous regraduation of the two clocks reduces the equation of A’s geodesic to 
its original form (7) so that, as far as this equation goes, the regraduated proper-time clock is 
“just as good as” the original one. 

Next consider the effect on the expression for the interval ds between the two events
$(s_1, s_2, \theta, \phi)$ and $(s_1 + ds_1, s_2 + ds_2, \theta + d\theta, \phi + d\phi)$. We write

$$\bar N(x) = N[F(x)], \qquad \bar N(x) = \int \bar n(x)\,dx, \qquad (17)$$

so that

$$\bar n(x) = n[F(x)]\,\frac{dF(x)}{dx} = n[F(x)]\,F'(x).$$

Hence using (15), equations (11) and (12) become

$$\tau = \bar N^{-1}[\tfrac12\{\bar N(S_2) + \bar N(S_1)\}], \qquad (18)$$

$$r = M^{-1}[\tfrac12\{\bar N(S_2) - \bar N(S_1)\} + M(0)]. \qquad (19)$$

The expression for the metric (13) is then

$$ds^2 = \{F'(\tau)\}^2\,dS^2, \qquad (20)$$

where

$$dS^2 = \frac{\bar n(S_1)\,\bar n(S_2)\,dS_1\,dS_2 - r^2\,d\theta^2 - r^2\sin^2\theta\,d\phi^2}{\bar n^2(\tau)}, \qquad (21)$$

and $\tau$, r stand for the expressions on the right-hand sides of (18) and (19) respectively.
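
The factorisation of the metric under the regraduation can be checked in a few lines. In this sketch $\bar n$ denotes the function defined by $\bar n(x) = n[F(x)]\,F'(x)$, and $s_1 = F(S_1)$, $s_2 = F(S_2)$, $t = F(\tau)$ are used throughout:

```latex
ds_1 = F'(S_1)\,dS_1, \qquad ds_2 = F'(S_2)\,dS_2,
\qquad
n(s_1)\,ds_1 = \bar n(S_1)\,dS_1, \qquad n(s_2)\,ds_2 = \bar n(S_2)\,dS_2,

n(t) = \frac{\bar n(\tau)}{F'(\tau)}
\quad\Longrightarrow\quad
\frac{1}{n^2(t)} = \frac{\{F'(\tau)\}^2}{\bar n^2(\tau)} .
```

Inserting these into (13) brings out the common factor $\{F'(\tau)\}^2$ and leaves an expression of exactly the same form as (13) in the new variables.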

At this point, having used the postulates of general relativity except for postulate (e) in its 
quantitative aspect (equations (3)), we might construct the argument given below to justify 
the proposition that the regraduation of a proper-time clock altered the space-time—from (13) 
to (21) in our case. We do not claim that the reader will be convinced by this argument; 
we are only concerned to show how the idea that the space-time is changed may have arisen. 
The argument would run as follows:— 

(i) Every event $(s_1, s_2, \theta, \phi)$ in (13) is represented by an event $(S_1, S_2, \theta, \phi)$ in (21).

(ii) On the one hand, all that was known of the space-time (13) was that its metric had
the form exhibited in (13) in which two undetermined functions, n and r, of $s_1$, $s_2$ entered
in a certain way. These functions involved $s_1$, $s_2$ in the particular combinations given by (11),
(8), and (12), (8) respectively. On the other hand, the metric (21) has the same form as (13),
and its coefficients also contain two unspecified functions $\bar n$, r of $S_1$, $S_2$. These functions are
the combinations of $S_1$, $S_2$ given by (18), (17), and (19), (17) respectively, in which the
unspecified functions $\bar n$, m play exactly the same rôle as did previously the unspecified functions
n, m.

(iii) The finite equation of the observer’s geodesic is formally the same in the space-time
(21) as it was in the space-time (13) (equations (7) and (16)).

(iv) The differential equations of all other geodesics (null and non-null) in the space-time
(21) are obtained by rewriting those of (13), replacing $s_1$, $s_2$ by $S_1$, $S_2$, and the functions
$n(s_1, s_2)$, $r(s_1, s_2)$ by $\bar n(S_1, S_2)$, $r(S_1, S_2)$ respectively.

(v) Hence finally, the observer might conclude that the space-times with metrics (13) 
and (21) were indistinguishable as far as his description of the material system had so far 
gone, and that therefore regraduation of proper-time had produced a transition from one 
space-time to another of, in general, intrinsically different curvature. 




We note that we have so far used only the kinematical postulates of general relativity,
together with the postulate that there is a qualitative parallelism between the properties of
the material system and the geometrical properties of space-time.


5. Proper-Time Regraduation and Einstein’s Gravitational Equations 

The remark at the end of the last paragraph makes it clear that the theory we have
considered so far is not, strictly speaking, general relativity, an integral feature of which
is equations (3). We now examine the effect of taking them into account.

For brevity we write

$$S_1 = x_1, \quad S_2 = x_2, \quad \theta = x_3, \quad \phi = x_4, \quad \sigma = \log F'(\tau), \qquad (22)$$

so that (20), (21) now become

$$ds^2 = e^{2\sigma}\,dS^2, \qquad (23)$$

$$dS^2 = 2g_{12}\,dx_1\,dx_2 + g_{33}\,dx_3^2 + g_{44}\,dx_4^2, \qquad (24)$$

where

$$g_{12} = g_{21} = \tfrac12\,\bar n(x_1)\,\bar n(x_2)/\bar n^2(\tau), \qquad g_{33} = g_{44}/\sin^2 x_3 = -r^2/\bar n^2(\tau), \qquad (25)$$

and the remaining $g_{ij}$ are all zero. In these equations $\tau$, r are given by (18) and (19) with
$S_1$, $S_2$ replaced by $x_1$, $x_2$ respectively.

We denote by $T_{ij}$ the energy-tensor calculated for the metric (23) from the right-hand
sides of equations (3) and by $t_{ij}$ the energy-tensor similarly calculated for (24). Using
formulae given by Eisenhart * we can express $T_{ij}$ in terms of $t_{ij}$ as follows:

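$$-\kappa T_{ij} = -\kappa t_{ij} + 2\sigma_{ij} + g_{ij}\{(e^{2\sigma} - 1)\Lambda - 2\Delta_2\sigma - \Delta_1\sigma\}, \qquad (26)$$

where

$$\sigma_{ij} = \frac{\partial^2\sigma}{\partial x_i\,\partial x_j} - \begin{Bmatrix} k \\ ij \end{Bmatrix}\frac{\partial\sigma}{\partial x_k} - \frac{\partial\sigma}{\partial x_i}\frac{\partial\sigma}{\partial x_j}, \qquad (27)$$

$$\Delta_1\sigma = g^{ij}\,\frac{\partial\sigma}{\partial x_i}\frac{\partial\sigma}{\partial x_j}, \qquad (28)$$

and the $\begin{Bmatrix} k \\ ij \end{Bmatrix}$ are the Christoffel symbols for (24). In obtaining (26) we have assumed that
scales are chosen so that the numerical values of c, $\kappa$, and $\Lambda$ do not alter in passing from the
space-time (23) to the space-time (24).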
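
Relation (26) can be checked from the transformation of the curvature under a conformal change of metric. The sketch below assumes four dimensions and the sign convention for $G_{ij}$ under which (26) holds; the tilde, marking quantities belonging to the metric (23), is notation introduced here, and $\Delta_2\sigma$ is taken as $g^{ij}$ contracted with the covariant second derivative of $\sigma$.

```latex
% Metric (23) is e^{2\sigma} times metric (24):  \tilde g_{ij} = e^{2\sigma} g_{ij}.
\tilde G_{ij} = G_{ij} + 2\sigma_{ij} + g_{ij}\,(\Delta_2\sigma + 2\Delta_1\sigma),
\qquad
e^{2\sigma}\,\tilde G = G + 6\,\Delta_2\sigma + 6\,\Delta_1\sigma .

% Insert both into the right-hand side of (3) written for the metric (23):
\tilde G_{ij} - \tfrac12(\tilde G - 2\Lambda)\,\tilde g_{ij}
  = G_{ij} - \tfrac12(G - 2\Lambda)\,g_{ij}
    + 2\sigma_{ij}
    + g_{ij}\,\{(e^{2\sigma} - 1)\Lambda - 2\Delta_2\sigma - \Delta_1\sigma\} .
```

The left-hand side is $-\kappa T_{ij}$ and the first group on the right is $-\kappa t_{ij}$, so that the remaining terms are those displayed in (26).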

If now it could be shown that

$$T_{ij} = t_{ij}, \qquad (30)$$

the argument at the end of the preceding section would be considerably reinforced. The
observer would then indeed be justified in thinking that regraduation of proper-time had
produced a change from the space-time (23) to (24) and that this change had left the
energy-tensor (in its covariant form at least) unchanged in value. But we can show that
(30) is, in general, impossible except for two trivial types of regraduation, as follows:

Consider the infinitesimal regraduation in which

$$\sigma = \epsilon\,q(x_1, x_2), \qquad (31)$$

where powers of $\epsilon$ higher than the first are negligible. We then have that $\Delta_1\sigma = 0$, and that
the third term on the right-hand side of (27) is also negligible. If (30) be true, then $T_{11} = t_{11}$
and $T_{22} = t_{22}$, and these equations, by (25) and (26), reduce to $\sigma_{11} = 0$ and $\sigma_{22} = 0$. Calculating
the Christoffel symbols for (24), the last two equations become

$$\frac{\partial^2 q}{\partial x_1^2} - \frac{1}{g_{12}}\frac{\partial g_{12}}{\partial x_1}\frac{\partial q}{\partial x_1} = 0, \qquad \frac{\partial^2 q}{\partial x_2^2} - \frac{1}{g_{12}}\frac{\partial g_{12}}{\partial x_2}\frac{\partial q}{\partial x_2} = 0,$$

* L. P. Eisenhart, Riemannian Geometry, § 28, Princeton Univ. Press, 1926.

whence we obtain

$$\frac{\partial q}{\partial x_1} = g_{12}\,Q_2, \qquad \frac{\partial q}{\partial x_2} = g_{12}\,Q_1, \qquad (32)$$

where $Q_2$, $Q_1$ are arbitrary functions of $x_2$ and $x_1$ respectively. Now let $F_\epsilon$ be the regraduation
function corresponding to (31). Then

$$e^{\sigma} = F_\epsilon'(\tau) = 1 + \epsilon\,q(x_1, x_2).$$

But, by (18), $\bar N(\tau) = \tfrac12\{\bar N(x_2) + \bar N(x_1)\}$ and hence, using (17) and (25), the equations (32)
become

$$F_\epsilon''(\tau)\,\bar n(\tau) = \epsilon\,Q_2(x_2)\,\bar n(x_2) = \epsilon\,Q_1(x_1)\,\bar n(x_1) = a. \qquad (32a)$$

Since these must be true for all values of $x_1$, $x_2$, a must be a constant.
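
The way in which the constant $a$ arises may be sketched as follows. The steps assume only (18), (25), (32) and the relation $F_\epsilon'(\tau) = 1 + \epsilon q$; $\bar n$ is the function defined below (17).

```latex
% Differentiate F_\epsilon'(\tau) = 1 + \epsilon q with respect to x_1 and use (32):
F_\epsilon''(\tau)\,\frac{\partial\tau}{\partial x_1}
  = \epsilon\,\frac{\partial q}{\partial x_1}
  = \epsilon\,g_{12}\,Q_2(x_2) .

% From (18) and (25):
\bar n(\tau)\,\frac{\partial\tau}{\partial x_1} = \tfrac12\,\bar n(x_1),
\qquad
g_{12} = \frac{\bar n(x_1)\,\bar n(x_2)}{2\,\bar n^2(\tau)} .

% Eliminating \partial\tau/\partial x_1 and g_{12}:
F_\epsilon''(\tau)\,\bar n(\tau) = \epsilon\,Q_2(x_2)\,\bar n(x_2) .
```

The same steps with $x_2$ in place of $x_1$ give $\epsilon\,Q_1(x_1)\,\bar n(x_1)$ on the right. A quantity that is at once a function of $x_1$ alone and of $x_2$ alone can only be a constant.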

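Again, to our order of approximation,

$$\Delta_2\sigma = g^{ij}\sigma_{ij}.$$

If therefore $T_{ij} = t_{ij}$ we must have

$$\frac{\sigma_{ij}}{g_{ij}} = -\tfrac12\{(e^{2\sigma} - 1)\Lambda - 2\Delta_2\sigma\},$$

so that

$$\frac{\sigma_{12}}{g_{12}} = \frac{\sigma_{33}}{g_{33}} = \frac{\sigma_{44}}{g_{44}}.$$

These last equations yield

$$\Delta_2\sigma = \frac{4\sigma_{12}}{g_{12}},$$

and the equation

$$2\sigma_{12} + g_{12}\{(e^{2\sigma} - 1)\Lambda - 2\Delta_2\sigma\} = 0$$

then reduces to

$$3\,\frac{\partial^2 q}{\partial x_1\,\partial x_2} = \Lambda\,q\,g_{12}.$$
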

On using (32), (32a), and (18) we find

$$\Lambda + 3a\,\frac{d}{d\tau}\left(\frac{1}{\bar n(\tau)}\right) - \Lambda F_\epsilon'(\tau) = 0.$$

Differentiating this equation with respect to $\tau$ and using (32a) again, we finally obtain

$$a\left\{3\,\frac{d^2}{d\tau^2}\left(\frac{1}{\bar n(\tau)}\right) - \frac{\Lambda}{\bar n(\tau)}\right\} = 0.$$

If therefore we do not impose any restrictions on $\bar n(\tau)$, the only solution of this equation is
$a = 0$. We shall consider this general case in a moment. We can, however, also satisfy the
foregoing equation for $a \neq 0$ in the special case for which

$$3\,\frac{d^2}{d\tau^2}\left(\frac{1}{\bar n(\tau)}\right) - \frac{\Lambda}{\bar n(\tau)} = 0.$$

This means that

$$\bar n(\tau) = e^{-\sqrt{\Lambda/3}\,\tau}.$$

There is only one permissible regraduation given by (32a), viz.:

$$F_\epsilon(\tau) = c + b\tau + \frac{3a}{\Lambda}\,e^{\sqrt{\Lambda/3}\,\tau},$$

where a, b, c are constants, and also only one permissible function n in (13), viz.:

$$n[F_\epsilon(x)] = \frac{e^{-\sqrt{\Lambda/3}\,x}}{F_\epsilon'(x)}.$$

This last result follows from the relation $\bar n(x) = n[F_\epsilon(x)]\,F_\epsilon'(x)$.

Hence, for a metric in which the function n has this particular form, there is a regraduation
function (depending on n) which makes $T_{ij} = t_{ij}$ and for which $a \neq 0$. This metric is of
“de Sitter” type.*

Returning to the general case $a = 0$, we have by (32a)

$$F_\epsilon''(\tau) = 0.$$

Hence we have: if $T_{ij} = t_{ij}$ for an infinitesimal regraduation from s to S, then

$$s = F_\epsilon(S) = (1 + \alpha)S + \beta,$$

where $\alpha$, $\beta$ are constants proportional to $\epsilon$. But this regraduation is trivial as it corresponds
to a change of origin ($\beta$) of time and to a change of scale in the unit used for time measurement
($\alpha$). Changes of this type do not produce a transition from one space-time to another,
as is obvious from the general formula (2) for the metric in which the constant $\alpha$ can be
absorbed into the coordinates by a change of scale.

Summarising the results of this and the last section, we may say: 

Regraduation of proper-time from s to S might be regarded as producing a change from 
one space-time to another provided that Einstein’s gravitational equations did not form part 
of general relativity. But since they do, the transition from one space-time to another is, 
in general, impossible because the material energy-tensors are not equivalent. Regraduation 
then reduces itself to a change of coordinates, as is most easily seen by expressing the metric 
in terms of light-signal coordinates, the analytical expressions of the metric, the energy-tensor, 
etc., being altered according to the well-known rules of the tensor calculus. 

It is tempting to go further and conclude that regraduation of proper-time is not possible 
in general relativity, a conclusion which is not justified, in our view, for the following reason. 
Regraduation is defined, both here and in kinematical relativity, by equation (1) and by 
nothing else. There is in particular no stipulation in the definition that the “ regraduated ” 
time-variable must be the arc-length of a geodesic. An observer in general relativity must, 
as we have seen in section 3, carry more than one clock with him (or keep on regraduating 
his proper-time clock!). Application of the transformation (1) to proper-time therefore 
merely produces a change from one of the observer’s clocks to another or, alternatively, to 
one of the parameters, $\mu$, in terms of which the equations of the geodesics have the form (4)
instead of (5).


6. A Particular Regraduation 

Regraduation of clocks came into prominence in kinematical relativity but, as we have 
seen, it is possible in general relativity also where it appears as a coordinate transformation. 
To illustrate this we consider a generalised case of the regraduation from “kinematical” to 
“dynamical” time of kinematical relativity which also has the effect of making the space-time 
(6) conformal to a static space-time. 

Since the equations which express $\tau$ and r in terms of $S_1$, $S_2$ ((18) and (19)) are identical
in form with those which express t and r in terms of $s_1$, $s_2$ ((11) and (12)), equation (20) may
also be written

$$ds^2 = \{F'(\tau)\}^2\left[d\tau^2 - \frac{1}{\bar n^2(\tau)}\{m^2(r)\,dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2\}\right].$$

It is possible to choose a regraduation function and a corresponding coordinate $\tau_m$ so that

$$\bar n(\tau_m) = 1. \qquad (33)$$

* In the de Sitter universe $1/n^2(t) = e^{2\sqrt{\Lambda/3}\,t}$, $m(r) = 1$. See, for example, G. C. McVittie, Cosmological Theory, p. 64, Methuen, 1937.

Let f be the regraduation function in question; then by (33) we have

$$n\{f(\tau_m)\}\cdot f'(\tau_m) = 1,$$

and since $t = f(\tau_m)$, it follows that

$$\tau_m = \int n(t)\,dt + \text{(constant)}.$$

Hence f is defined through its inverse $f^{-1}$ by

$$f^{-1}(x) = \int n(x)\,dx + \text{(constant)}.$$

Again $\bar N(x) = \int \bar n(x)\,dx$, so that, for the regraduation which produces (33), we have

$$\bar N(\tau_m) = \tau_m.$$

Hence, by (18) and (19), $\tau_m$ and r are given in terms of the light-signal coordinates corresponding
to the regraduation $s = f(S)$ by

$$\tau_m = \tfrac12(S_2 + S_1), \qquad M(r) = \tfrac12(S_2 - S_1) + M(0). \qquad (34)$$

If therefore we define a new “radial” coordinate by $dR_m = m(r)\,dr$ and adjust the constant
of integration so that $R_m = 0$ when $r = 0$, we can write (34) in the form

$$\tau_m = \tfrac12(S_2 + S_1), \qquad R_m = \tfrac12(S_2 - S_1).$$

Hence $\tau_m$, $R_m$ are coordinates of the kind used in kinematical relativity. We thus reach the
conclusion:

In the space-time (6) the observer at the origin of spatial coordinates can regraduate his
proper-time and his t-clock by

$$s = f(S), \qquad t = f(\tau_m),$$

where

$$f^{-1}(x) = \int n(x)\,dx + \text{(constant)},$$

so that the expression for the metric becomes

$$ds^2 = \{f'(\tau_m)\}^2\left\{d\tau_m^2 - dR_m^2 - [M^{-1}(R_m + M(0))]^2\,(d\theta^2 + \sin^2\theta\,d\phi^2)\right\},$$

where $\tau_m$, $R_m$ are “kinematical” coordinates. The metric is thus reduced to a form conformal
to a static metric.

The coordinate $\tau_m$ is a generalisation of the “dynamical” time of kinematical relativity
for any pair of functions n, m. It will be remembered that Milne’s theory throws up a
particular form of (6) in which

$$n(t) = t_0/t, \qquad m^2(r) = 1/(1 + r^2/R_0^2),$$

so that $\tau_m = t_0 + t_0\log(t/t_0)$, which is Milne’s transformation from dynamical to kinematical
time.
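
The Milne case can be worked through explicitly. In the sketch below the constant of integration is fixed by the assumption (made here only for definiteness) that $\tau_m = t_0$ when $t = t_0$:

```latex
\tau_m = \int \frac{t_0}{t}\,dt + \text{(constant)}
       = t_0 + t_0\log\frac{t}{t_0},
\qquad
t = f(\tau_m) = t_0\,\exp\!\left(\frac{\tau_m - t_0}{t_0}\right),

% check of the defining property (33):
n\{f(\tau_m)\}\,f'(\tau_m) = \frac{t_0}{t}\cdot\frac{t}{t_0} = 1 .
```

The conformal factor of the metric is then $\{f'(\tau_m)\}^2 = (t/t_0)^2$.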

I wish to express my thanks to Dr A. G. Walker for many helpful suggestions and 
criticisms. 


Summary. 

The changes in his description of events brought about by an arbitrary regraduation of 
an observer’s clock are examined, taking the axioms of general relativity as fundamental. 
It is shown that regraduation does not imply a change from one Riemannian space-time to 
another but merely a coordinate transformation within space-time. A generalisation of the 
“ dynamical time” of kinematical relativity is a by-product of the investigation. 


(Issued separately November 5, 1945)





XVIII.— The Riemann Tensor in a Completely Harmonic $V_4$. By H. S. Ruse,

University College, Southampton. 

(MS. received April 5, 1945. Read June 4, 1945) 


1. Statement of the Problem 

If $(x_0^i)$ is a fixed point of a Riemannian $V_n$ of fundamental tensor $g_{ij}$, and if $s$ is the geodesic
distance between it and a variable point $(x^i)$, then the $V_n$ has been called centrally harmonic
with respect to the base-point $(x_0^i)$ if

$$\Delta_2 s \equiv \frac{1}{\sqrt{g}}\,\frac{\partial}{\partial x^i}\left(\sqrt{g}\,g^{ij}\,\frac{\partial s}{\partial x^j}\right)$$

is a function of s only, and completely harmonic if this holds for every choice of base-point
$(x_0^i)$. A flat $V_n$ ($g_{ij} = \delta_{ij}$) is obviously completely harmonic, since for such a space
$s = \sqrt{\{\Sigma (x^i - x_0^i)^2\}}$ and

$$\Delta_2 s \equiv \sum_i \frac{\partial^2 s}{(\partial x^i)^2} = \frac{n - 1}{s}.$$
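
The flat-space value of $\Delta_2 s$ follows from a direct calculation, sketched here with $x_0^i$ denoting the coordinates of the base-point:

```latex
\frac{\partial s}{\partial x^i} = \frac{x^i - x_0^i}{s},
\qquad
\frac{\partial^2 s}{(\partial x^i)^2} = \frac{1}{s} - \frac{(x^i - x_0^i)^2}{s^3}
\quad\text{(no summation)},

\sum_{i=1}^{n} \frac{\partial^2 s}{(\partial x^i)^2}
  = \frac{n}{s} - \frac{s^2}{s^3}
  = \frac{n - 1}{s} .
```

The result depends on the variable point only through $s$, and it does so whatever base-point is chosen.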


The concept of such spaces arose out of an attempt to find a single general formula for
Hadamard's “elementary” solution of the tensor generalisation $\Delta_2 V = 0$ of Laplace's equation
(Ruse, 1930-31; Copson and Ruse, 1940). In the second of these papers it was shown how
to obtain the conditions, in terms of the Riemann tensor, that a $V_n$ should be completely
harmonic. The first condition was

$$R_{ij} = k\,g_{ij} \quad (k \text{ const.}), \qquad \text{(A)}$$

showing that a completely harmonic space is an Einstein space. The second was a condition 
that may be written 

sww d'Z'gijgMj (B) 

where $\Sigma'$ denotes the sum taken over all permutations of the free suffixes i, j, k, l; $\theta$ is a
scalar and $R_{ijkl}$ the Riemann tensor (skew in i, j and in k, l). The remaining conditions
were infinite in number and involved the covariant derivatives of $R_{ijkl}$. They were too
complicated to be given in terms of the Riemann tensor itself, but were expressed instead
in terms of the normal tensors. Walker (1942) has given another method of obtaining them.

Now every $V_n$ of constant curvature is completely harmonic, and it seems quite probable,
as Copson and I suggested (loc. cit.), that every completely harmonic $V_n$ is of constant
curvature. It is certainly so for $n = 2$ and $n = 3$, and Walker has shown it also to be so when

(a) $V_n$ is conformal to a flat space;

(b) $n = 4$ and $V_n$ is of signature ± 2.

Condition (A) is alone sufficient to prove the result for $n = 3$ and for Walker's case (a). His
(b) requires the use of both (A) and (B). The purpose of the present paper is to obtain all
types of $V_4$ that are algebraically possible when the Riemann tensor satisfies both (A) and (B).
It is found that, when the signature is not ± 2, there is no algebraic necessity for the $V_4$ to
be of constant curvature. A fortiori, the same is true for a $V_n$ with $n \geq 5$. It therefore
seems probable that, even when (A) and (B) are regarded as partial differential equations in
the $g_{ij}$ and not merely as algebraic conditions imposed upon the Riemann tensor, they cannot
alone require the $V_n$ to be of constant curvature, though this is a point that is still unsettled.
So also is the question of what limitation is placed on $V_n$ by the remaining infinite sequence
of conditions for a completely harmonic space.

This paper therefore makes comparatively small though definite headway in the problem
of determining the nature of completely harmonic spaces. What may be of greater interest
than the result itself is the illustration it provides of methods developed in four recent papers
(Ruse, 1944, 1945 a, 1945 b, 1946), in which the Riemann tensor for a $V_4$ is regarded as
defining, at each point $(x^i)$ of $V_4$, a quadratic complex of lines in the projective 3-space
associated with the point $(x^i)$. The problem of solving (A) and (B) completely as simultaneous
equations in $R_{ijkl}$ (for that is what it amounts to) is by no means a trivial one even for the
case $n = 4$, and has provided an entirely unexpected application of the theory of the Riemann
complex.



2. Geometrical Significance of Conditions (A) and (B) 


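It is a consequence of conditions (A) and (B) that, at every point of $V_4$, the equations

$$g^{pq}R_{ipjq}\,\xi^i\xi^j = 0, \qquad (2.1)$$

$$g^{pr}g^{qs}R_{ipjq}R_{krls}\,\xi^i\xi^j\xi^k\xi^l = 0 \qquad (2.2)$$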

hold for every vector f * such that 



(2-3) 

that is, for every null vector 



In the projective $S_3$ at infinity in the tangent-space $T_4$ at any point $(x^i)$ of $V_4$, the equation

$$R_{ijkl}\,p^{ij}p^{kl} = 0$$

represents the Riemann complex (Ruse, 1944), the $p^{ij}$ being current Plücker co-ordinates. It will be assumed in the first place that this is a proper quadratic complex, that is, that it is not a pair of linear complexes. This restriction will be removed later.

If we write

$$S_{pq} \equiv R_{ipjq}\,\xi^i\xi^j,$$

then, in $S_3$, $S_{pq}X^pX^q = 0$ is the equation of the complex cone of the point $\xi^i$ which, by (2.3), is any point on the fundamental quadric. Conditions (2.1) and (2.2) may now be written

$$g^{pq}S_{pq} = 0, \tag{2.4}$$

$$g^{pr}g^{qs}S_{pq}S_{rs} = 0. \tag{2.5}$$

Equation (2.4) states that the complex cone of any point $\xi^i$ on the fundamental quadric is outpolar to that quadric, or, taken in its original form (A), that the quadratic complex is self-polar (of the first kind) with respect to the fundamental quadric (Ruse, 1944, p. 71, and 1946).

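Now

$$\tfrac12\,g^{pr}g^{qs}\left(S_{pr}S_{qs} - S_{pq}S_{rs}\right) \equiv \tfrac12\left(g^{pq}S_{pq}\right)^2 - \tfrac12\,g^{pr}g^{qs}S_{pq}S_{rs} = 0 \tag{2.6}$$

by (2.4) and (2.5).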

Consider the pencil of quadrics $S_{ij} + \sigma g_{ij}$ defined by the cone $S_{ij}$ and the non-degenerate quadric $g_{ij}$. In the usual notation, its characteristic equation $|S_{ij} + \sigma g_{ij}| = 0$ is

$$\Delta\sigma^4 + \Theta\sigma^3 + \Phi\sigma^2 + \Theta'\sigma + \Delta' = 0. \tag{2.7}$$

All the coefficients except $\Delta$, which is equal to the determinant $g$, are zero: $\Delta'$ because it is equal to the determinant $|S_{ij}|$, which is zero because $S_{ij}$ is a cone; $\Theta'$ because the vertex $\xi^i$ of this cone lies upon the quadric (Sommerville, 1934, p. 319); $\Phi$ because it is equal to $g$ times the left-hand side of (2.6); and $\Theta$ because it is equal to $g$ times the left-hand side of (2.4).

To find the precise nature of the relationship of the cone $S_{ij}$ to the fundamental quadric, take a set of non-homogeneous co-ordinates $(x, y, z)$ with origin $O$ at the vertex of the cone, and with $x$- and $y$-axes along the generators of the quadric through $O$. The tangent-plane to the quadric at $O$ is then $z = 0$, and the equations of the quadric $(g)$ and cone $(S)$ are respectively of the forms

$$(g)\qquad cz^2 + 2fyz + 2gzx + 2hxy + 2rz = 0, \tag{2.8}$$

$$(S)\qquad a'x^2 + b'y^2 + c'z^2 + 2f'yz + 2g'zx + 2h'xy = 0. \tag{2.9}$$




The characteristic equation (2.7) is now

$$h^2r^2\sigma^4 + 2hh'r^2\sigma^3 - r^2\left(a'b' - h'^2\right)\sigma^2 = 0. \tag{2.10}$$

As the coefficient $\Delta$ of $\sigma^4$ is not zero, neither $h$ nor $r$ is zero. Hence $\Theta = 0$ gives $h' = 0$. Consequently $\Phi = 0$ gives $a'b' = 0$, so one at least of $a'$ and $b'$ is zero. Take $b' = 0$. Then the equation of the cone is reduced to the form

$$(S)\qquad a'x^2 + c'z^2 + 2f'yz + 2g'zx = 0, \tag{2.11}$$

where $a'$ may or may not be zero. The Segre characteristic of the pencil is in general [4] (Sommerville, 1934, p. 271), but may have one of the forms [(31)], [(22)], [(211)] if one or more of the coefficients $a'$, $c'$, $f'$, $g'$ are zero.
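The expansion of the determinant $|S_{ij} + \sigma g_{ij}|$ into the quartic (2.10) can be checked mechanically. The following is only an illustrative sketch, not part of the original argument: it assumes the homogeneous co-ordinate order $(x, y, z, t)$, so that the term $2rz$ of (2.8) becomes $2rzt$, and it compares the determinant with the polynomial $h^2r^2\sigma^4 + 2hh'r^2\sigma^3 - r^2(a'b' - h'^2)\sigma^2$ for random integer coefficients. The function names are hypothetical.

```python
import random


def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def pencil_det(q, s, sigma):
    """|S + sigma*g| in homogeneous co-ordinates (x, y, z, t).

    q = (c, f, g, h, r): coefficients of the quadric (2.8).
    s = (a1, b1, c1, f1, g1, h1): primed coefficients of the cone (2.9).
    """
    c, f, g, h, r = q
    a1, b1, c1, f1, g1, h1 = s
    G = [[0, h, g, 0], [h, 0, f, 0], [g, f, c, r], [0, 0, r, 0]]
    S = [[a1, h1, g1, 0], [h1, b1, f1, 0], [g1, f1, c1, 0], [0, 0, 0, 0]]
    return det([[S[i][j] + sigma * G[i][j] for j in range(4)] for i in range(4)])


def quartic(q, s, sigma):
    """Left-hand side of (2.10) as a polynomial in sigma."""
    c, f, g, h, r = q
    a1, b1, c1, f1, g1, h1 = s
    return (h * h * r * r * sigma ** 4
            + 2 * h * h1 * r * r * sigma ** 3
            - r * r * (a1 * b1 - h1 * h1) * sigma ** 2)


def check(trials=50, seed=0):
    """Compare determinant and quartic at 7 values of sigma per random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        q = tuple(rng.randint(-5, 5) for _ in range(5))
        s = tuple(rng.randint(-5, 5) for _ in range(6))
        for sigma in range(-3, 4):  # more than 5 points fix a quartic uniquely
            if pencil_det(q, s, sigma) != quartic(q, s, sigma):
                return False
    return True
```

Since the coefficients $\Theta'$ and $\Delta'$ vanish identically in this co-ordinate system, the check also illustrates why only $\Theta = 0$ and $\Phi = 0$ impose conditions on the cone.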

Suppose $a' \neq 0$. Then the cone cuts the tangent-plane $z = 0$ of the quadric where $x^2 = 0$; that is, it touches it along the generator $z = 0 = x$ of the quadric. This generator therefore lies upon the cone and so belongs to the quadratic complex $R_{ijkl}$.

If $a' = 0$, then the cone degenerates into a pair of planes of which one is the tangent-plane $z = 0$ to the quadric. The tangent-plane contains the two generators of the quadric that pass through the vertex $O$ of the degenerate cone, so both belong to the quadratic complex. We therefore have the following theorem:

The first two of the conditions for a $V_4$ to be completely harmonic, namely conditions (A) and (B), require in $S_3$ that one or both of the generators through every point $O$ of the fundamental quadric should belong to the quadratic complex $R_{ijkl}$.

Conversely, if at least one generator through every point of the fundamental quadric belongs to the quadratic complex, and if the complex is self-polar with respect to the fundamental quadric (thereby satisfying the Einstein condition (A)), then it also satisfies condition (B). For take any point $O$ on the quadric and the generators through it as $x$- and $y$-axes. The equation of the quadric is then of the form (2.8), and the complex cone of $O$, being of vertex $O$, has an equation of the form (2.9). If the generator $z = 0 = x$ of the quadric belongs to the complex, it is also a generator of the cone, so $b' = 0$. As the complex is self-polar with respect to the quadric, we also have $h' = 0$, as seen above, so by (2.10) the coefficient $\Phi$ of $\sigma^2$ is zero. Thus $\Phi = 0$ for every point on the quadric, which is equivalent to condition (B) when taken in conjunction with condition (A).

3. Consequences of the Theorem 

It follows from the theorem of § 2 that an infinite number of the generators of the fundamental quadric (at least one through each point upon it) belong to the quadratic complex. Therefore at least one whole regulus belongs to the complex, because a regulus has either all of its lines in common with any quadratic complex or else only four of them, the four being not necessarily all distinct. This fact is obvious from the well-known representation of the lines of a projective 3-space by points in a projective 5-space (see, e.g., Ruse, 1944, § 7). Thus either both systems of generators of the fundamental quadric belong to the Riemann complex $R_{ijkl}$, or else all the lines of one belong to it, and only four of the other. In the former case the complex is necessarily harmonic with the fundamental quadric as one of its defining quadrics (Hudson, 1905, p. 97; Jessop, 1903, p. 358, ex. 66). So in this case $R_{ijkl}$ is of the form

$$R_{ijkl} \equiv g_{ik}p_{jl} + g_{jl}p_{ik} - g_{il}p_{jk} - g_{jk}p_{il},$$

$p_{ij}$ being a symmetric tensor. From the fact that

$$g^{jk}R_{ijkl} = k\,g_{il} \tag{3.1}$$

for an Einstein space, it follows at once that

$$R_{ijkl} = \tfrac13 k\left(g_{il}g_{jk} - g_{ik}g_{jl}\right) \qquad \left(p_{ij} = -\tfrac16 k\,g_{ij}\right), \tag{3.2}$$

and the $V_4$ is thus of constant curvature.





Now suppose that one whole regulus belongs to the quadratic complex, but only four 
of the other. 

At $(x^i)$ in $V_4$ take an orthogonal ennuple $h^i_a = (h^i_1, \ldots, h^i_4)$ in such a way that

$$g_{ij}\,h^i_a h^j_b = \delta_{ab},$$

whether $ds^2$ is positive definite at $(x^i)$ or not. If it is not, then some of the $h^i_a$ will be purely imaginary. Form the ennuplet components of all tensors at $(x^i)$; for example, let

$$R_{abcd} = R_{ijkl}\,h^i_a h^j_b h^k_c h^l_d.$$


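Let $p^{ab}$ be the ennuplet components of a simple bivector at $(x^i)$, that is, the Plücker co-ordinates of a line in $S_3$ with the tetrahedron $h^i_a$ $(a = 1, 2, 3, 4)$ as tetrahedron of reference. Then if we write

$$\chi^1 = \tfrac{1}{\sqrt2}\left(p^{23} + p^{14}\right), \qquad \chi^2 = \tfrac{1}{\sqrt2}\left(p^{31} + p^{24}\right), \qquad \chi^3 = \tfrac{1}{\sqrt2}\left(p^{12} + p^{34}\right),$$

$$\chi^4 = \tfrac{1}{\sqrt2}\left(p^{23} - p^{14}\right), \qquad \chi^5 = \tfrac{1}{\sqrt2}\left(p^{31} - p^{24}\right), \qquad \chi^6 = \tfrac{1}{\sqrt2}\left(p^{12} - p^{34}\right),$$

we get the usual representation of the lines of $S_3$ by points of a projective space $S_5$ in terms of the co-ordinates $\chi^\alpha = (\chi^1, \ldots, \chi^6)$, which are effectively those of Klein. Greek suffixes will always run from 1 to 6 and will refer to $S_5$. (For a more detailed account of the notation, see Ruse, 1946, §§ 1, 2.) Then the lines of $S_3$ are represented in $S_5$ by the points of the 4-quadric

$$\epsilon_{\alpha\beta}\chi^\alpha\chi^\beta = 0,$$

where (ibid. (2.7))

$$\epsilon_{\alpha\beta}\chi^\alpha\chi^\beta \equiv \tfrac14\,\epsilon_{abcd}\,p^{ab}p^{cd} \equiv (\chi^1)^2 + (\chi^2)^2 + (\chi^3)^2 - (\chi^4)^2 - (\chi^5)^2 - (\chi^6)^2, \tag{3.3}$$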


$\epsilon_{abcd}$ being the ennuplet dualising tensor of components ±1, 0. The lines of $S_3$ that touch the fundamental quadric form a special quadratic complex, which, in $S_5$, is represented by the intersection of the $\epsilon$-quadric (3.3) with the quadric

$$g_{\alpha\beta}\chi^\alpha\chi^\beta \equiv \sum_{\alpha=1}^{6}(\chi^\alpha)^2 = 0 \tag{3.4}$$

(ibid. (2.8)). The lines of the Riemann complex likewise correspond in $S_5$ to the intersection of the $\epsilon$-quadric with the 4-quadric

$$R_{\alpha\beta}\chi^\alpha\chi^\beta \equiv \tfrac14\,R_{abcd}\,p^{ab}p^{cd} = 0.$$

Also, as the Riemann complex for a completely harmonic space is self-polar of the first kind, the 6 × 6 matrix $[R_{\alpha\beta}]$ has the form

$$[R_{\alpha\beta}] = \begin{bmatrix} U & O \\ O & V \end{bmatrix}, \tag{3.5}$$

where $U$, $V$ are symmetric 3 × 3 matrices and $O$ is the null 3 × 3 matrix (ibid. (3.8)).

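Now the regulus in $S_3$ that we are supposing to belong to the Riemann complex corresponds in $S_5$ to the conic in which one of the 2-planes $\chi^1 = \chi^2 = \chi^3 = 0$ and $\chi^4 = \chi^5 = \chi^6 = 0$, say the former, cuts the $\epsilon$-quadric (cf. Jessop, 1903, p. 211, § 170). Thus every point $(0, 0, 0, \chi^4, \chi^5, \chi^6)$ on the $\epsilon$-quadric, that is, by (3.3), every point $(0, 0, 0, \chi^4, \chi^5, \chi^6)$ for which

$$(\chi^4)^2 + (\chi^5)^2 + (\chi^6)^2 = 0,$$

must lie on the 4-quadric $R_{\alpha\beta}$. Consequently we must have

$$R_{\alpha\beta}\chi^\alpha\chi^\beta = \gamma\left[(\chi^4)^2 + (\chi^5)^2 + (\chi^6)^2\right] \quad\text{when}\quad \chi^1 = \chi^2 = \chi^3 = 0,$$

$\gamma$ being a scalar. This means that the matrix $V$ of (3.5) must be equal to $\gamma I$, where $I$ is the unit 3 × 3 matrix. Thus $R_{\alpha\beta}$ has the form

$$[R_{\alpha\beta}] = \begin{bmatrix} U & O \\ O & \gamma I \end{bmatrix} = \begin{bmatrix} a & h & g & & & \\ h & b & f & & & \\ g & f & c & & & \\ & & & \gamma & & \\ & & & & \gamma & \\ & & & & & \gamma \end{bmatrix} \tag{3.6}$$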


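say. Hence, by (3.3), the matrix that determines the $\epsilon$-characteristic (Ruse, 1945 a, § 4) of the Riemann complex, that is, its Segre characteristic in the ordinary sense (Jessop, 1903, ch. xi; Zindler, 1922, pp. 1129-31), is

$$[R_{\alpha\beta} - \lambda\epsilon_{\alpha\beta}] = \begin{bmatrix} U - \lambda I & O \\ O & (\gamma + \lambda) I \end{bmatrix} = \begin{bmatrix} a - \lambda & h & g & & & \\ h & b - \lambda & f & & & \\ g & f & c - \lambda & & & \\ & & & \gamma + \lambda & & \\ & & & & \gamma + \lambda & \\ & & & & & \gamma + \lambda \end{bmatrix}. \tag{3.7}$$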


When the signature of $ds^2$ at $(x^i)$ in $V_4$ is either (+ + + −) or (− − − +), then $U$ and $V$ are complex conjugate matrices (Ruse, 1946, § 4). Therefore, since $V = \gamma I$, $U$ is equal to $\gamma^* I$, the asterisk denoting the complex conjugate. So in this case

$$[R_{\alpha\beta} - \lambda\epsilon_{\alpha\beta}] = \begin{bmatrix} (\gamma^* - \lambda) I & O \\ O & (\gamma + \lambda) I \end{bmatrix},$$

and the $\epsilon$-characteristic is therefore [(111)(111)]. Hence the Riemann complex consists of the tangents to a quadric (Jessop, 1903, p. 211), and this can only be the fundamental quadric itself. We are in fact back again to the case considered above when both reguli of the fundamental quadric belong to the Riemann complex, and the $V_4$ is of constant curvature. The cases (+ + + −), (− − − +) cover all those of one minus or one plus sign, since it is a matter of choice which of the four co-ordinates $x^i$ of $V_4$ is initially called $x^4$. Thus we have re-established Walker’s result that any completely harmonic $V_4$ of signature ± 2 is of constant curvature.

In general, however, (3.7) is not of characteristic [(111)(111)]. Assuming for the moment that none of the latent roots of the matrix $U$ is equal to $-\gamma$, we obviously obtain all the possible characteristics of (3.7) by combining with the Segre characteristic [(111)] of the 3 × 3 matrix $(\gamma + \lambda)I$ all the possible characteristics of the pencil $U - \lambda I$. These are

[111], [(11)1], [21], [(21)], [3], [(111)] (3.8)

(Bôcher, 1936, p. 309). The following is therefore an exhaustive list of the characteristics that are possible for a non-degenerate Riemann complex satisfying conditions (A) and (B) for a completely harmonic space:

[(111)111], [(111)(11)1], [(111)21],
[(111)(21)], [(111)3], [(111)(111)].

Of these, the last has already been found and corresponds to the case when $V_4$ is of constant curvature. The others are the characteristics of all the non-special and non-degenerate complexes that have a repeated quadric, namely the fundamental quadric itself, as singular surface (Jessop, 1903, p. 231; Zindler, 1922, pp. 1130-31).


4. The Degenerate Cases 

Enough has already been done to show that conditions (A) and (B) do not impose upon $V_4$ any algebraic necessity that it should be of constant curvature. This result was established on the assumption that the Riemann complex was non-degenerate, that is, that it did not consist of a pair of linear complexes. This assumption is equivalent to the supposition made above that none of the latent roots of $U$ is equal to $-\gamma$. We can now remove this restriction.





A direct if slightly laborious calculation shows that, when $R_{ijkl}$ is reducible to the form (3.6), it satisfies conditions (A) and (B) whatever the values of the elements $a, b, c, f, g, h$ of $U$. Therefore there is no bar to the assumption that some or all of the latent roots of $U$ are equal to $-\gamma$, and we may therefore combine the characteristics (3.8) with [(111)] by enclosing any individual index, or any round-bracketed pair or triplet of indices, with the 111 in the round brackets of [(111)]. Omitting the trivial case [(111111)], which corresponds to $R_{ijkl} = 0$, we therefore have the following possible $\epsilon$-characteristics in addition to those given above:

[(1111)11]; [(11111)1], [(1111)(11)];
[(2111)1], [(1111)2]; [(21111)]; [(3111)]. (4.1)
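The bookkeeping behind the two lists of characteristics can be reproduced with a few lines of code. This is only an illustrative sketch under stated assumptions: each Segre characteristic of the pencil $U - \lambda I$ is stored as a list of index groups (one group per distinct latent root), the non-degenerate list is formed by placing the group (111) of $(\gamma + \lambda)I$ beside those groups, and the degenerate list by merging exactly one group with (111), which models one latent root of $U$ coinciding with $-\gamma$. The names `BASE`, `non_degenerate` and `degenerate` are hypothetical.

```python
# Characteristics (3.8) of U - lambda*I, one tuple per distinct latent root.
BASE = {
    "[111]": [(1,), (1,), (1,)],
    "[(11)1]": [(1, 1), (1,)],
    "[21]": [(2,), (1,)],
    "[(21)]": [(2, 1)],
    "[3]": [(3,)],
    "[(111)]": [(1, 1, 1)],
}


def fmt_group(group):
    """A single index is printed bare; several indices share round brackets."""
    digits = "".join(str(n) for n in group)
    return digits if len(group) == 1 else "(" + digits + ")"


def fmt(groups):
    return "[" + "".join(fmt_group(g) for g in groups) + "]"


def non_degenerate():
    """Put the triple root -gamma, of characteristic (111), beside each entry."""
    return [fmt([(1, 1, 1)] + groups) for groups in BASE.values()]


def degenerate():
    """Merge one group of each entry into (111); duplicates are dropped."""
    labels = []
    for groups in BASE.values():
        for k, group in enumerate(groups):
            merged = tuple(sorted(group + (1, 1, 1), reverse=True))
            rest = groups[:k] + groups[k + 1:]
            label = fmt([merged] + rest)
            if label not in labels:
                labels.append(label)
    return labels
```

Under these assumptions `degenerate()` yields the seven entries of (4.1) followed by the trivial case [(111111)], so the list in the text is what the merging rule produces.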


5. Analysis of the Degenerate Cases 


The list (4.1) includes all cases except one, namely [(2211)], in which the quadratic complex is a pair of linear complexes (Zindler, 1922, p. 1133). If the linear complexes are $m_{ij}$, $n_{ij}$, where, from the point of view of the underlying $V_4$, $m_{ij}$ and $n_{ij}$ are bivectors, then the equation of the quadratic complex is

$$(m_{ij}\,p^{ij})(n_{kl}\,p^{kl}) = 0,$$

and so

$$R_{ijkl} = \tfrac12\{m_{ij}n_{kl} + n_{ij}m_{kl}\} + \rho\,\epsilon_{ijkl}.$$

The identity

$$R_{ijkl} + R_{iklj} + R_{iljk} = 0,$$

which may be written

$$\epsilon^{ijkl}R_{ijkl} = 0,$$

at once gives

$$\rho = -\tfrac{1}{12}\,{}^{\circ}m^{kl}n_{kl},$$

where

$${}^{\circ}m^{kl} \equiv \tfrac12\,\epsilon^{klij}m_{ij}$$

is the dual of $m_{ij}$. So if for brevity the inner product $r^{ij}s_{ij}$ of any contravariant bivector $r^{ij}$ and any covariant bivector $s_{ij}$ is denoted by $(rs)$, then

$$R_{ijkl} = \tfrac12\left(m_{ij}n_{kl} + n_{ij}m_{kl}\right) - \tfrac{1}{12}({}^{\circ}mn)\,\epsilon_{ijkl}. \tag{5.1}$$

Detailed analysis similar to that given above for the non-degenerate cases shows that, if $R_{ijkl}$ satisfies conditions (A) and (B), then

$${}^{\circ}m_{ij} = e\,m_{ij}, \qquad {}^{\circ}n_{ij} = e\,n_{ij}, \tag{5.2}$$

where

$$e = \pm 1.$$


Geometrically these equations mean that each of the linear complexes $m_{ij}$, $n_{ij}$ is self-polar with respect to the fundamental quadric. Condition (A) by itself merely requires the Riemann complex to be self-polar of the first kind, which does not rule out the possibility that the linear complexes $m_{ij}$, $n_{ij}$ in $S_3$ should be polar to one another even when they are non-coincident; it is condition (B) that restricts each to be self-polar. In $V_4$, equations (5.2) mean that the bivectors are either both self-dual or both anti-self-dual.

If $m_{ij}$ and $n_{ij}$ happen both to be special linear complexes, then, by (5.2), their directrices are generators of the same system of the fundamental quadric. That they could not belong to opposite systems is otherwise evident. For, if they did, we should have $({}^{\circ}mn) = 0$ because they intersect, while, instead of (5.2), we should have

$${}^{\circ}m_{ij} = e\,m_{ij}, \qquad {}^{\circ}n_{ij} = -e\,n_{ij},$$

and hence, by (5.1),

$${}^{\circ}R_{ijkl} = -R_{ijkl},$$

and the Riemann complex would be self-polar of the second kind (Ruse, 1944 (6.12)). But as it is known to be self-polar of the first kind, we already have ${}^{\circ}R_{ijkl} = R_{ijkl}$, and hence the Riemann tensor would be identically zero and the Riemann complex non-existent. Hence the case when the Riemann complex is a pair of special linear complexes with intersecting directrices is ruled out, this being the case [(2211)] referred to above. All the other degenerate cases appear to be algebraically possible.

Consider in particular the case [(11111)1]. This is the case in which the Riemann complex is a repeated linear complex, obtained from (5.1) by putting $n_{ij} = m_{ij}$. We have, therefore,

$$R_{ijkl} = m_{ij}m_{kl} - \tfrac{1}{12}({}^{\circ}mm)\,\epsilon_{ijkl}, \tag{5.3}$$

where, by (5.2), $m_{ij}$ is such that

$${}^{\circ}m_{ij} = \pm m_{ij}. \tag{5.4}$$

But, as is well known, any skew-symmetric tensor $m_{ij}$ in $V_4$ satisfies the identity

$${}^{\circ}m^{ik}m_{kl} = -\tfrac14({}^{\circ}mm)\,\delta^i_l,$$

and so, by (5.4),

$$m^{ik}m_{kl} = -\tfrac14(mm)\,\delta^i_l. \tag{5.5}$$

Multiplying (5.3) by $g^{jk}$, summing for $j$ and $k$, and raising the suffix $i$, we obtain, with the help of (5.5),

$$g^{jk}R^{i}{}_{jkl} = -\tfrac14(mm)\,\delta^i_l, \tag{5.6}$$

which verifies that the $V_4$ is an Einstein space. Further,

$$R^{rijs}R_{rkls} = \left[m^{ri}m^{js} - \tfrac{1}{12}({}^{\circ}mm)\,\epsilon^{rijs}\right]\left[m_{rk}m_{ls} - \tfrac{1}{12}({}^{\circ}mm)\,\epsilon_{rkls}\right]$$

$$= m^{ri}m_{rk}\,m^{js}m_{ls} + \text{terms involving the } \epsilon\text{-tensor}$$

$$= \tfrac{1}{16}(mm)^2\,\delta^i_k\delta^j_l + \ldots$$

by (5.5), whence, lowering $i$, $j$,

$$g^{pr}g^{qs}R_{pijq}R_{rkls} = \tfrac{1}{16}(mm)^2\,g_{ik}g_{jl} + \ldots$$

If the sum is now taken over all permutations of $i$, $j$, $k$, $l$, the terms involving the $\epsilon$-tensor (represented by dots in the last equation) disappear on account of the skewness of $\epsilon_{ijkl}$ in all its suffixes, and we obtain

$$\sum g^{pr}g^{qs}R_{pijq}R_{rkls} = \tfrac12(mm)^2\left(g_{ij}g_{kl} + g_{ik}g_{jl} + g_{il}g_{jk}\right), \tag{5.7}$$

which verifies that, if the Riemann tensor has the form (5.3), it satisfies condition (B) for a completely harmonic space. This example is alone sufficient to show that conditions (A) and (B), treated purely algebraically, do not require the $V_4$ to be of constant curvature.
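The algebra of this example lends itself to a numerical spot check. The sketch below is illustrative only and rests on stated assumptions: it works in ennuplet components with $g_{ab} = \delta_{ab}$ (so upper and lower suffixes coincide), takes the dual as ${}^{\circ}m^{kl} = \tfrac12\epsilon^{klij}m_{ij}$, and uses integer arithmetic by carrying a factor 12 on the tensor $m_{ij}m_{kl} - \tfrac{1}{12}({}^{\circ}mm)\epsilon_{ijkl}$. It tests the identity ${}^{\circ}m^{ik}m_{kl} = -\tfrac14({}^{\circ}mm)\delta^i_l$, the cyclic identity, and the pure-trace form of the contraction with $g^{jk}$ for a self-dual bivector. All function names are hypothetical.

```python
from itertools import permutations
import random


def parity(p):
    """Sign of a permutation of (0, 1, 2, 3), by counting inversions."""
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4) if p[i] > p[j])
    return -1 if inversions % 2 else 1


EPS = {p: parity(p) for p in permutations(range(4))}


def eps(i, j, k, l):
    return EPS.get((i, j, k, l), 0)


def dual(m):
    # dual^{kl} = (1/2) eps^{klij} m_ij; summing over i < j absorbs the 1/2.
    return [[sum(eps(k, l, i, j) * m[i][j]
                 for i in range(4) for j in range(i + 1, 4))
             for l in range(4)] for k in range(4)]


def inner(r, s):
    """Inner product (rs) = r^{ab} s_ab, summed over all a, b."""
    return sum(r[a][b] * s[a][b] for a in range(4) for b in range(4))


def random_bivector(rng):
    m = [[0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(i + 1, 4):
            m[i][j] = rng.randint(-9, 9)
            m[j][i] = -m[i][j]
    return m


def self_dual(a, b, c):
    """A bivector equal to its own dual under the conventions above."""
    m = [[0] * 4 for _ in range(4)]
    entries = {(0, 1): a, (2, 3): a, (0, 2): b, (3, 1): b, (0, 3): c, (1, 2): c}
    for (i, j), value in entries.items():
        m[i][j] = value
        m[j][i] = -value
    return m


def identity_holds(m):
    """4 * dual^{ik} m_kl == -(dual m) * delta^i_l for every i, l."""
    dm = dual(m)
    s = inner(dm, m)
    return all(4 * sum(dm[i][k] * m[k][l] for k in range(4)) == (-s if i == l else 0)
               for i in range(4) for l in range(4))


def riemann12(m):
    """12 * R_ijkl for R_ijkl = m_ij m_kl - (1/12)(dual m, m) eps_ijkl."""
    s = inner(dual(m), m)
    r = range(4)
    return [[[[12 * m[i][j] * m[k][l] - s * eps(i, j, k, l) for l in r]
              for k in r] for j in r] for i in r]


def bianchi_holds(m):
    R = riemann12(m)
    r = range(4)
    return all(R[i][j][k][l] + R[i][k][l][j] + R[i][l][j][k] == 0
               for i in r for j in r for k in r for l in r)


def contracted12(m):
    """12 * sum_j R_ijjl, the contraction with g^{jk} = delta^{jk}."""
    R = riemann12(m)
    return [[sum(R[i][j][j][l] for j in range(4)) for l in range(4)]
            for i in range(4)]
```

The cyclic identity holds for any bivector, which is what fixes the coefficient $\tfrac{1}{12}$; the contraction is a multiple of $\delta_{il}$ only when the bivector is self-dual or anti-self-dual.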

If in (5.3) we take $({}^{\circ}mm) = 0$, then the linear complex in $S_3$ is special; and, as it is self-polar, its directrix is a generator of the fundamental quadric. The Riemann tensor now has the form

$$R_{ijkl} = m_{ij}m_{kl},$$

and the right-hand sides of (5.6) and (5.7) are both zero. So in this case the Riemann tensor satisfies conditions (A) and (B) with the constants on their right-hand sides both zero. The $\epsilon$-characteristic is [(21111)] (Zindler, 1922, p. 1133, no. 57). An example of a non-Einstein $V_4$ with this $\epsilon$-characteristic has been given elsewhere (Ruse, 1945 a, § 5), but it is open to doubt whether an Einstein $V_4$ can in fact be as specialised as this.


REFERENCES TO LITERATURE

Bôcher, M., 1936. Introduction to higher algebra (Macmillan, New York).

Copson, E. T., and Ruse, H. S., 1940. “Harmonic Riemannian Spaces,” Proc. Roy. Soc. Edinburgh, LX.

Hudson, R. W. H. T., 1905. Kummer's quartic surface, Cambridge.

Jessop, C. M., 1903. A treatise on the line complex, Cambridge.

Ruse, H. S., 1930-31. “On the ‘elementary’ solution of Laplace’s equation,” Proc. Edinburgh Math. Soc. (2), II, 135-139.

-, 1944. “On the line-geometry of the Riemann tensor,” Proc. Roy. Soc. Edinburgh, A, LXII, 64-73.

-, 1945 a. “The five-dimensional geometry of the curvature tensor in a Riemannian V 4,” Quarterly Journal of Math. (Oxford) (in press).

-, 1945 b. “Sets of vectors in a V 4 defined by the Riemann tensor,” Journal London Math. Soc. (in press).

-, 1946 (?). “The self-polar Riemann complex for a V 4,” Proc. London Math. Soc. (in press).

Sommerville, D. M. Y., 1934. Analytical geometry of three dimensions, Cambridge.

Walker, A. G., 1942. “Note on a distance invariant and the calculation of Ruse’s invariant,” Proc. Edinburgh Math. Soc. (2), VII, 16-26.

Zindler, K., 1922. “Algebraische Liniengeometrie,” Encyklopädie der Math. Wiss., III, C 8, 973-1228.


Note added in proof, 10th September 1945. Since this paper was written, a paper “Sur les espaces riemanniens complètement harmoniques,” by A. Lichnerowicz, has appeared in the current volume (1945) of the Bull. Soc. Math. de France. Professor Lichnerowicz, with whom I have lately been in correspondence, has also made some headway in the problem of determining the nature of completely harmonic spaces, though by methods entirely different from those of this paper. The question whether such spaces are all of constant curvature remains open.


(Issued separately November 5, 1945)





XIX.— A Theory of Regraduation in General Relativity. By A. G. Walker, 

D.Sc., Department of Pure Mathematics, University of Liverpool. Communicated 
by Sir Edmund Whittaker, F.R.S. 

(MS. received April 17, 1945. Read June 4, 1945) 

1. It was remarked by me a few years ago that temporal regraduations, other than trivial 
changes of zero and unit, had not so far been considered in General Relativity. An interesting 
paper by Dr G. C. McVittie * has now appeared in which regraduations are examined in 
certain spherically symmetric space-times. Under the assumptions made by McVittie it 
is shown that regraduations can exist for some but not all space-times, those for which they 
can exist being of a very special form which excludes many space-times generally regarded 
as significant or interesting. In the present paper I take the matter further and discuss the 
problem with more generality. It will be shown that the existence of non-trivial regraduations 
depends firstly upon which theory is being assumed for the derivation of the conservation 
equations $T^{ij}{}_{,j} = 0$. There are two alternatives, and regraduations are found to be excluded
by one, the “geodesic” theory, but not necessarily by the other, the “equivalence” theory. 

When the equivalence theory is adopted, the general problem of regraduation depends 
upon what we mean by physically equivalent models, i.e. models transformable into each 
other by regraduations. A precise definition is given in § 5, and this I shall call the “theory 
of regraduation”, since it is additional to what is usually understood as the General Theory 
of Relativity. It is, however, consistent with, and in the spirit of, General Relativity, and 
leads to remarkable results in connection with the Lemaitre universes. For example, a 
non-static Lemaitre model can in general be regraduated to become static,* and, even more 
remarkable, a regraduation can always be found which will “transform away” the cosmical 
constant. Lemaitre models are the only systems examined in detail, for it will be shown 
that with few exceptions these are the only systems in which pressure is isotropic and which 
admit non-trivial regraduations. 

In order to explain the present theory it is necessary to examine closely the relations 
between the whole and the local space-times, and the constructions of the tensors $g_{ij}$ and $T^{ij}$.
This is done in §§ 1, 2, 3, and a discussion of the field equations is given in § 4. Many 
standard results are given without proof or reference, and a knowledge of the main features 
of General Relativity is assumed. 

2. The concept of a coordinate system is fundamental in General Relativity, and an event may be defined as an ordered set of four numbers† $x^i$. In order to avoid confusion between regraduations and coordinate transformations, the coordinate system will at first be regarded as invariable, the coordinates of an event being permanent labels attached to the event. The physical system ($P_4$) under consideration is the set of all events, and space-time ($V_4$) appears when a Riemannian structure is associated with $P_4$.

Associated with an observer is a one-dimensional continuous set of events called the observer’s world-line; an event on this line is at the observer, and the world-line may be described as the observer’s “history”. At each event E are infinitely many observers, and each observer possesses a local coordinate system, $X^a$, for the description of his infinitesimal neighbourhood in $P_4$. Events for which $X^\lambda = 0$ are at the observer, and in particular, E is the event $X^a = 0$. For any other event $X^a$, $X^0$ is the interval of time and $X^\lambda$ the cartesian components of distance from E to the event. The $X^a$ are infinitesimals and the ratios $X^\lambda/X^0$ are components of velocity. The scalar velocity of light at E is the same in all directions, say $c$ in the units of $X^a$. The proper vector of an observer at E is the vector $\delta_0^a$ in his local system, and the proper fundamental tensor is $\delta_{ab}$, where $\delta_{00} = 1$, $\delta_{\lambda\lambda} = -c^{-2}$, and $\delta_{ab} = 0$ $(a \neq b)$.

* 1945. I am indebted to Dr McVittie for permitting me to refer to his MS.

† Latin suffixes will take values 0, 1, 2, 3, and Greek suffixes will take values 1, 2, 3. The summation convention will apply only when a suffix is repeated and occurs once as a subscript and once as a superscript.

Each other observer at E has a similar local system, and his units can be chosen so that his local system is related to the above system by a Lorentz transformation. Transforming this observer’s proper vector, it becomes say $h^a$ in the first observer’s local system. If now E′(X) is an event near E, then the temporal and distance measures of the interval EE′ as given by the second observer at E are the projections of the vector $X^a$ along and orthogonal to $h^a$, the latter projection being then multiplied by $c$; here “projection” and “orthogonal” are defined with respect to $\delta_{ab}$. In particular, if $X^a = \epsilon h^a$, then E′ occurs at the second observer, and $\epsilon$ is his interval of proper time between E and E′.

Consider now the relations between local systems and the basic system $x^i$. Then for each event E($x$), the $X^a$ are infinitesimals localized at E and will correspond to differentials $dx^i$ by relations of the form

$$dx^i = A_a^i(x)\,X^a \qquad (|A_a^i| \neq 0). \tag{1}$$

The coefficients $A_a^i$ at each event are used to transform tensors from local to basic coordinates. An observer’s proper-vector is now $h^i = A_a^i h^a$, and from the local tensor $\delta_{ab}$ we get the field tensor

$$g_{ij}(x) = \delta_{ab}\,A_i^a(x)\,A_j^b(x), \tag{2}$$

the matrix $(A_i^a)$ being the reciprocal of $(A_a^i)$. These new tensors are independent of the particular local system used in the transformation (1), the $A$’s transforming with the $X$’s in passing from one observer to another so that $dx^i$ in (1) are unaltered.

We now have a fundamental tensor field $g_{ij}(x)$, and a proper vector $h^i$ for each observer at each event. The differential quadratic

$$ds^2 = g_{ij}\,dx^i dx^j \tag{3}$$

is taken to be the metric of space-time $V_4$; from (2) it has signature −2. From (2) and the definition of $h^i$, it follows that $h^i$ is a unit time-like vector in $V_4$, i.e. satisfies

$$g_{ij}\,h^i h^j = 1. \tag{4}$$

It also follows that: 

For an observer whose proper vector is $h^i$ at an event $x^i$, the temporal and distance measures of the interval between the events $x^i$ and $x^i + dx^i$ are $g_{ij}h^i dx^j$, and $c$ times the component of the vector $dx^i$ orthogonal to $h^i$, respectively.

An observer’s world-line is now a curve in $V_4$, and from the definition of proper-vector it follows that the unit tangent vector to this curve at any event is the observer’s proper-vector at that event. Also, from the above theorem, the interval of proper-time between any two events of the world-line is $\int ds$ where $ds$ is given by (3).

From the definitions of $c$ and $\delta_{ab}$ it follows that intervals along light paths at an event E satisfy $\delta_{ab}X^aX^b = 0$. In basic coordinates, therefore, we see from (1) and (2) that a light path satisfies $g_{ij}\,dx^i dx^j = 0$, i.e. is a null curve. The adoption of Fermat’s principle now leads to the result that light paths are null geodesics in $V_4$.

This completes the description of the purely kinematical properties of space-time.

3. The dynamical properties of a physical system and their representation in space-time concern the energy tensor $T^{ij}$. This is primarily a local tensor, and is best described at an event E in terms of the local system $X^a$ of an observer having the mean motion of the matter in the neighbourhood of E. If $\rho$ is the mass-density and $p_\lambda^*$ the principal pressures measured by this observer, and if the axial planes $dX^\lambda = 0$ are taken to be the principal pressure planes, then $T^{ab}$ is defined by

$$T^{00} = \rho, \qquad T^{\lambda\lambda} = p_\lambda^*, \qquad T^{ab} = 0 \quad (a \neq b). \tag{5}$$

Transforming now to basic coordinates by means of $T^{ij} = T^{ab}A_a^i A_b^j$, we find

$$T^{ij} = \rho\,h^i h^j + c^{-2}\sum_\lambda p_\lambda^*\,k_\lambda^i k_\lambda^j, \tag{6}$$

where $h^i$ is the proper-vector of the above observer and $k_\lambda^i$, $\lambda = 1, 2, 3$, are certain unit space-like vectors orthogonal to each other and to $h^i$.

This vector $h^i$ will be called the stream-vector at E. It is unique except when $\rho = -c^{-2}p_\lambda^*$ for some $\lambda$, and since this relation implies either empty space or large negative pressure, it is not satisfied in any sensible physical system. An important property, which follows from the form of (6), is that the stream-vector is the unit time-like principal vector of the energy tensor. The world-lines whose tangent vectors are stream-vectors will be called stream-lines. In general, therefore, there is just one stream-line through each event. Observers having these curves as world-lines will be called fundamental observers.

The pressure at E is isotropic if $p_1^* = p_2^* = p_3^* = p^*$, say. It is convenient to write $p = c^{-2}p^*$ and call $p$ the pressure at E. It is in fact the equivalent mass-density, the mass-equivalent of energy $V$ being $Vc^{-2}$. We now see from (6) and the properties of orthogonal vectors that when pressure is isotropic,

$$T^{ij} = (\rho + p)\,h^i h^j - p\,g^{ij}. \tag{7}$$

An important quantity which gives the density of matter at E apart from radiation-mass (which is included in $\rho$) is $\rho_0 = g_{ij}T^{ij} = T$. From (7) this is equal to $\rho - 3p$.
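As a quick check of this last statement, contract (7) with $g_{ij}$, using $g_{ij}h^ih^j = 1$ from (4) and $g_{ij}g^{ij} = 4$ in a four-dimensional space-time:

```latex
\begin{aligned}
\rho_0 = g_{ij}T^{ij} &= (\rho + p)\,g_{ij}h^ih^j - p\,g_{ij}g^{ij} \\
&= (\rho + p) - 4p \\
&= \rho - 3p .
\end{aligned}
```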

4. It is part of the general theory of relativity that the equations expressing conservation of energy and momentum are taken to be

$$T^{ij}{}_{,j} = 0. \tag{8}$$

There are two standard derivations: the first is from the hypothesis (Synge, 1934) that the paths of free particles are geodesics in $V_4$, and the second is from the classical equations of motion and conservation by means of the Principle of Equivalence (Eddington, 1930). It can be deduced from (8) that the path of a particle under certain conditions of symmetry and isolation is a geodesic. On the whole, however, the geodesic hypothesis is the more restrictive, as will appear from the results of the present paper, since this hypothesis excludes regraduations whereas the other does not.

Another part of the general theory is the identification of $T_{ij}$ with a tensor involving only $g_{ij}$ and its derivatives apart from constants. The possible forms of this tensor are restricted by (8), and the field equations adopted by Einstein are

$$-k\,T_{ij} = G_{ij} - \tfrac12 G\,g_{ij} + \Lambda\,g_{ij}, \tag{9}$$

where $T_{ij}$ is the covariant form of $T^{ij}$, $G_{ij}$ is the contracted curvature tensor of $V_4$, $G = g^{ij}G_{ij}$, and $k$ and $\Lambda$ are constants ($k \neq 0$).

The simplest interpretation of the field equations is that if $g_{ij}(x)$, $k$ and $\Lambda$ are known, then $T_{ij}$ and hence the physical state at every event is known, and we have a complete description of some physical system. This, however, is usually regarded as logically unsatisfactory because it subordinates physics to geometry. Alternatively it has been suggested that knowledge of the physical system gives $T_{ij}(x)$ and that the field equations then serve to determine $g_{ij}$, the basic coordinate system having been previously fixed in some primitive way, and $k$ and $\Lambda$ being supposed known. This again is not logical, for knowledge of the physical state at an event E gives both $T_{ij}$ and $g_{ij}$ at E. This follows from the fact that if $X^a$ are the special local coordinates described in § 3, then the relations (1) as well as the density and pressures are presumed known at E, and these lead to $g_{ij}$ in (2) as well as to $T_{ij}$.

We conclude, therefore, that the field equations are in the nature of a test. Given the totality of local measures of a physical system, including the relations (1) for some basic coordinate system, then $g_{ij}$ and $T_{ij}$ are determined at every event, and the physical system possesses the structure of general relativity if (9) is satisfied for some constants $k$ and $\Lambda$. This point of view introduces two new ideas. Firstly, $k$ and $\Lambda$ are now regarded as determined by the field equations rather than put into them, and any physical significance which these constants may have (e.g. $k = 8\pi$) is presumed to be implicit in the field equations. Secondly, all local measures and hence also $g_{ij}$ and $T_{ij}$ depend upon the units of time, length, and mass adopted at each event. There are no a priori restrictions upon these units, but the field equations prevent them from being completely arbitrary. These equations thus serve to determine preferential scales of time, length, and mass for any particular system.

When the field equations permit changes of scales, we say that the system admits a regraduation. For the remainder of this paper we shall be concerned with the study of such systems.

Returning to the constants $k$ and $\Lambda$, these are dimensional and may therefore alter in value with a regraduation. A novel feature, however, is that the present interpretation of the field equations allows $\Lambda$ to vanish for some systems even when $\Lambda \neq 0$ for others. In a Lemaitre model, for example, it will be found that if $\Lambda$ is non-zero, then there exists a regraduation which has the effect of making the new $\Lambda$ zero. This fact should go some way towards answering the philosophical question concerning the significance of the cosmical constant.

5. A regraduation must first be considered as a local affair. If X a are local coordinates 
and m represents mass at an event E, then a regraduation at E is a transformation of the 
form 

X° = «X°, X x = *;X\ m=wm (10) 

for some positive numbers u, v, w and all X a , m. If new values arising from this regraduation 
are denoted by a bar, then we have at once c — cvu ~ 1 , p =pwv~ z ,f k * —pfwv- x u~ z y and from (1), 

A 0 * = irtV, A*< - *r*Afc<. (n) 

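Hence from (5), $\bar{T}^{00} = wv^{-3}T^{00}$ and $\bar{T}^{\lambda\mu} = wv^{-1}u^{-2}T^{\lambda\mu}$, so that from (11) and the definitions
of $g_{ij}$, $T^{ij}$ and $T_{ij}$, we find

$$\bar{g}_{ij} = u^2 g_{ij}, \qquad \bar{T}^{ij} = wv^{-3}u^{-2}T^{ij}, \qquad \bar{T}_{ij} = wv^{-3}u^{2}T_{ij}. \tag{12}$$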

If now a regraduation is supposed to take place at every event, then $u$, $v$, $w$ exist for all
$x^i$ and we have equations of the form

$$\bar{g}_{ij} = e^{2\sigma}g_{ij}, \qquad \bar{T}_{ij} = e^{\psi}T_{ij}, \tag{13}$$

relating old and new tensor fields. Here, $\sigma$ and $\psi$ can be functions of the $x$'s for we do not
presuppose that $u$, $v$, $w$ are the same for all events, i.e. $u$, $v$, $w$ may be functions of the $x$'s.
The relations $\bar{g}_{ij} \propto g_{ij}$ are not surprising when we remember that null directions in $V_4$ are
significant physically and must therefore correspond to null directions in $\bar{V}_4$, the space-time
with metric

$$d\bar{s}^2 = \bar{g}_{ij}\,dx^i dx^j. \tag{14}$$

It follows from $\bar{g}_{ij} \propto g_{ij}$ that null geodesics in $V_4$ correspond to null geodesics in $\bar{V}_4$, as
required since these are light paths.

The relation between old and new proper-times depends generally upon the world-line
under consideration. Integrating along the same curve and between the same two events,
the old and new intervals of time are $\int ds$ and $\int d\bar{s} = \int e^{\sigma}\,ds$.

It must be remembered that a regraduation is not a coordinate transformation, the basic
coordinates $x^i$ attached to an event remaining unaltered while local coordinates are changed
as in (10). We have seen that a regraduation changes the structure of space-time, $V_4$ being
replaced by $\bar{V}_4$, but this should not be confused with the effect of a coordinate transformation.
In (13), both old and new tensors are functions of the same basic coordinates, and the relations
between old and new are not those which arise when coordinates are transformed.

Considering now the physical system as a whole, a uniform regraduation is defined as
one in which $u$, $v$ and $w$ are constants. Thus $\sigma$ and $\psi$ are constants, and $\int d\bar{s} = e^{\sigma}\int ds$.
Calculating $\bar{G}_{ij}$ and $\bar{G}$ for $\bar{V}_4$, we see at once that if the old tensors satisfy (9), then the new tensors
satisfy

$$-\bar{\kappa}\bar{T}_{ij} = \bar{G}_{ij} - \tfrac{1}{2}\bar{G}\bar{g}_{ij} + \bar{\Lambda}\bar{g}_{ij} \tag{15}$$

where $\bar{\kappa} = \kappa e^{-\psi}$ and $\bar{\Lambda} = \Lambda e^{-2\sigma}$. These are field equations, and we have simply shown that
every system admits all uniform regraduations.

As explained in § 4, there are in our theory no restrictions on $u$, $v$, $w$ except those resulting
from the field equations and from the geodesic hypothesis if this is adopted (see § 7).
Excluding this hypothesis, then a physical system together with a basic coordinate system,
tensors $g_{ij}$ and $T_{ij}$, and constants $\kappa$ and $\Lambda$ satisfying (9) will be called a model (M), and we
shall say that models M and $\bar{M}$ are equivalent if their tensors satisfy equations of the
form (13). It is understood here that M satisfies (9) and $\bar{M}$ satisfies (15), but it is not assumed
that $\bar{\kappa}$ and $\bar{\Lambda}$ are related in any definite way to $\kappa$ and $\Lambda$. If M and $\bar{M}$ are equivalent, then
each can be obtained from the other by a regraduation, and, in our view, both are models of
the same physical system.

6. For non-uniform regraduations, $u$, $v$ and $w$ are not all constants, and certain frequencies,
lengths, velocities, masses, etc., which are constant (as the event varies) in the old scales may
become variable in the new. We thus get essentially different descriptions of the same physical
system although no conservation laws are broken, these being obeyed in each model in
consequence of the appropriate field equations.

If for convenience we adopt a definable universal method of measuring some physical
object, then regraduations are restricted accordingly. The simplest example of this is to
measure length in relation to time so that the velocity of light has the same value at all events.
This gives $v/u$ = const., but does not restrict $\sigma$ or $\psi$. Another example is for fundamental
observers to define scales of time by means of the frequency of a particular Fraunhofer line,
so that $u$ = const. and $\sigma$ = const. Still another example is to define a unit of mass everywhere
by means of a recognisable elementary particle. This gives $w$ = const., and when taken
together with the “$c$” restriction gives $\psi = -\sigma$ + const. Other relations between $\sigma$ and $\psi$
result from other restrictions. One of the assumptions made by McVittie is that $\psi = 0$, but
no physical reason was given; an interpretation is that mass is everywhere measured in
relation to time and length so that the Newtonian constant of gravitation has a fixed value.

In the present paper we shall consider regraduations in general until § 9, when we shall
impose the restriction $\psi = \psi(\sigma)$ and later (§ 12), $\psi = n\sigma + a$ where $n$ and $a$ are constants. This
last includes many of the interesting physical restrictions.

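7. According to the geodesic hypothesis, time-like geodesics in $V_4$ are paths of free
particles and therefore significant physically. It follows that a permissible regraduation
must be such that these same curves are geodesics in $\bar{V}_4$. The coordinates $x^i$ therefore give
a geodesic correspondence between $V_4$ and $\bar{V}_4$, the conditions for which are known to be
(Eisenhart, 1926, § 40)

$$\overline{\left\{{i \atop jk}\right\}} - \left\{{i \atop jk}\right\} = \delta^i_j\lambda_k + \delta^i_k\lambda_j$$

for some vector $\lambda_i$, where $\left\{{i \atop jk}\right\}$, $\overline{\left\{{i \atop jk}\right\}}$ are the Christoffel symbols for $V_4$, $\bar{V}_4$ respectively.
From (13) we find

$$\overline{\left\{{i \atop jk}\right\}} - \left\{{i \atop jk}\right\} = \delta^i_j\sigma_{,k} + \delta^i_k\sigma_{,j} - g_{jk}g^{il}\sigma_{,l} \qquad \left(\sigma_{,i} = \frac{\partial\sigma}{\partial x^i}\right) \tag{16}$$

whence $\sigma$ must be such that

$$\delta^i_j\sigma_{,k} + \delta^i_k\sigma_{,j} - g_{jk}g^{il}\sigma_{,l} = \delta^i_j\lambda_k + \delta^i_k\lambda_j \tag{17}$$

for some $\lambda_i$. Multiplying by $g^{jk}g_{ih}$ and contracting, we find $\lambda_h = -\sigma_{,h}$. Substituting in
(17), we get $\sigma_{,i} = 0$, so that $\sigma$ must be constant.

When $\sigma$ is constant, $\bar{G}_{ij} = G_{ij}$ and from (9) and (15) we deduce that either $\psi$ is also constant
or $T_{ij} \propto g_{ij}$. From (9), the latter implies $G_{ij} = \tfrac{1}{4}Gg_{ij}$ and $V_4$ is therefore an Einstein space,
with $G$ = constant. This again leads to $\psi$ = constant, from (13) and (15).

Without any substantial loss of generality we can assume that a regraduation for which
$\sigma$ and $\psi$ are both constant is uniform.* Hence, the only regraduations consistent with the
geodesic hypothesis are uniform.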

8. For the remainder of this paper the geodesic hypothesis will be excluded, and a
regraduation is valid if (13) hold for some functions $\sigma$, $\psi$, the respective field equations being (9)
and (15). From the first part of (13) we have †

$$\bar{G}_{ij} = G_{ij} + 2\sigma_{,ij} - 2\sigma_{,i}\sigma_{,j} + g_{ij}(\Delta_2\sigma + 2\Delta_1\sigma),$$

where a comma denotes covariant differentiation in $V_4$, and

$$\Delta_1\sigma = g^{ij}\sigma_{,i}\sigma_{,j}, \qquad \Delta_2\sigma = g^{ij}\sigma_{,ij}. \tag{18}$$

* This is exact if the velocity of light is everywhere taken to have the same value.
† Eisenhart, 1926, § 28. Eisenhart's $R_{ij}$ is our $G_{ij}$.

Substituting in (15), then from (9) and (13) we find

$$(\kappa - \bar{\kappa}e^{\psi})T_{ij} = 2\sigma_{,ij} - 2\sigma_{,i}\sigma_{,j} - g_{ij}(2\Delta_2\sigma + \Delta_1\sigma + \Lambda - \bar{\Lambda}e^{2\sigma}). \tag{19}$$

A given model can therefore be regraduated if it satisfies (19) for some functions $\sigma$, $\psi$ and
some constants $\bar{\kappa}$, $\bar{\Lambda}$, and the new model is given by these constants and (13).

Regarding (19) as equations for $\sigma$ and $\psi$, then certain conditions of integrability must be
satisfied, and these will ultimately give in explicit form the conditions which a model must
satisfy if it is to admit regraduation. This general problem has not been considered, but it
is clear from the form of (19) that not every model admits a non-uniform regraduation. This
extends McVittie’s conclusions, which concern only the case $\psi = 0$, $\bar{\kappa} = \kappa$, $\bar{\Lambda} = \Lambda$.

We shall now consider only those regraduations for which $\psi = \psi(\sigma)$, and the models which
admit them, because these are of some physical interest and because (19) become more
tractable.

9. We shall first prove in the case $\psi = \psi(\sigma)$, $\sigma \neq$ constant, that the surfaces $\sigma$ = constant
in $V_4$ are geodesic parallels, except possibly when $\psi = -2\sigma + a$ where $a$ is constant. The
condition for this is that $\Delta_1\sigma$ should be a function of $\sigma$, i.e. $(\Delta_1\sigma)_{,i} \propto \sigma_{,i}$, this being necessary
and sufficient.* From (19) we have

$$(\Delta_1\sigma)_{,i} = 2g^{jk}\sigma_{,j}\sigma_{,ki} = (\kappa - \bar{\kappa}e^{\psi})T_i^{\,j}\sigma_{,j} + \sigma_{,i}(2\Delta_2\sigma + 3\Delta_1\sigma + \Lambda - \bar{\Lambda}e^{2\sigma}),$$

so that it is sufficient to prove that $T_i^{\,j}\sigma_{,j} \propto \sigma_{,i}$.

From (8), which are consequences of (9), and from the corresponding equations in $\bar{V}_4$,
we get

$$T_i^{\,j}(\psi_{,j} - 2\sigma_{,j}) + T_i^{\,k}\left(\overline{\left\{{j \atop kj}\right\}} - \left\{{j \atop kj}\right\}\right) - T_k^{\,j}\left(\overline{\left\{{k \atop ij}\right\}} - \left\{{k \atop ij}\right\}\right) = 0.$$

Writing $\psi_1 = d\psi/d\sigma$, and substituting from (16), we get

$$(\psi_1 + 2)T_i^{\,j}\sigma_{,j} = T\sigma_{,i}. \tag{20}$$

Hence $T_i^{\,j}\sigma_{,j} \propto \sigma_{,i}$ except possibly when $\psi_1 = -2$.

When $\psi_1 = -2$, then $T = 0$ since $\sigma$ is not constant. Thus $\rho_0 = 0$, and this case can only
apply to a model which is either empty or contains only radiation. Even in this case it is
possible that the above theorem still holds in consequence of the conditions of integrability
of (19). In the following discussion, regraduations will be restricted if necessary so that the
theorem holds when $\psi_1 = -2$.

Another consequence of $T_i^{\,j}\sigma_{,j} \propto \sigma_{,i}$ is that $g^{ij}\sigma_{,j}$ is in a principal direction of $T_{ij}$, and is
therefore along or orthogonal to the stream-vector at each event, the former being the case
when $g^{ij}\sigma_{,j}$ is time-like. If this vector is space-like, then it at once follows that $d\sigma = 0$ along
every stream-line and the temporal regraduation is uniform for each fundamental observer,
although not necessarily the same for all such observers. Such a situation promises to be
interesting, but is not the kind with which we are primarily concerned in the present paper.
It will therefore be assumed that $g^{ij}\sigma_{,j}$ is time-like, and so is in the direction of the
stream-vector.

It follows from this and the first theorem that in a model which admits regraduations of
the kind here considered, the stream lines are geodesics and the surfaces $\sigma$ = constant are
orthogonal to them. We see therefore that basic coordinates $t$, $x^\lambda$ can be chosen so that the
stream-lines are $x^\lambda$ = constant and so that the metric of $V_4$ takes the form

$$ds^2 = dt^2 + g_{\lambda\mu}\,dx^\lambda dx^\mu. \tag{21}$$

Every regraduation of the kind here considered now satisfies $\sigma = \sigma(t)$ and hence also $\psi = \psi(t)$.

10. The physical systems which we intend to consider in detail are those in which pressure
is everywhere isotropic. From the definitions of $t$, $x^\lambda$ above and from (7), we have

h^Bfy T 0 x “ O, Txfi « gxfx.* 

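Also $\sigma_{,i} = \sigma'\delta_i^0$, where a prime denotes differentiation with respect to $t$, and (19) gives
$\sigma_{,\lambda\mu} \propto g_{\lambda\mu}$. But $\sigma_{,\lambda\mu} = -\left\{{0 \atop \lambda\mu}\right\}\sigma' = \tfrac{1}{2}\sigma'\,\partial g_{\lambda\mu}/\partial t$, whence $g_{\lambda\mu} = Fa_{\lambda\mu}$ for some $F(t, x)$ and $a_{\lambda\mu}(x)$.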

* This follows from Eisenhart. 1926, § 19. 





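From (9) and (21), $T_{0\lambda} = 0$ implies $G_{0\lambda} = 0$. Calculating $G_{0\lambda}$, these equations become
$\partial^2 \log F/\partial t\,\partial x^\lambda = 0$, so that $F$ is of the form $f(t)\,.\,g(x)$. Writing $f(t) = -R^2$, and absorbing $g(x)$ in
$a_{\lambda\mu}$, we finally see that (21) becomes

$$ds^2 = dt^2 - R^2 a_{\lambda\mu}\,dx^\lambda dx^\mu \tag{22}$$

where $R = R(t)$ and $a_{\lambda\mu}$ are independent of $t$.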

Writing $G^*_{\lambda\mu}$ for the contracted curvature tensor in the space $V_3$ with metric $a_{\lambda\mu}\,dx^\lambda dx^\mu$,
then from (22) we find

$$G_{\lambda\mu} = G^*_{\lambda\mu} + \theta a_{\lambda\mu}$$

for some $\theta$. Substituting in (9) and remembering that $T_{\lambda\mu} \propto g_{\lambda\mu}$, we have that $G^*_{\lambda\mu} \propto a_{\lambda\mu}$, and
$V_3$ is therefore an Einstein space. But every Einstein 3-space has constant curvature
(Eisenhart, 1926, p. 92), whence $V_3$ is a space of constant curvature. The metric (22) is
now recognized as that of a Lemaitre universe, and we have

The only physical systems in which pressure is everywhere isotropic and which admit
non-uniform regraduations of the restricted kind $\psi = \psi(\sigma)$ are those described by Lemaitre
models.

By modifying $R$, the constant curvature $k$ of $V_3$ can be taken to be 1, 0, or −1, and it
will be understood that this has been done. We shall now be concerned with Lemaitre
models only, the elements of such a model being $R(t)$, $k$, $\kappa$ and $\Lambda$, with $T_{ij}$ given by (9) and (22),
and $\rho$ and $p$ by (7) with the stream-vector having components $\delta^i_0$. We find, as is well known,*

$$\kappa\rho = 3R^{-2}(R'^2 + k) - \Lambda, \tag{23}$$

$$\kappa p = -R^{-2}(2RR'' + R'^2 + k) + \Lambda. \tag{24}$$

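11. We now have

$$\sigma_{,i} = \sigma'\delta_i^0, \qquad \sigma_{,00} = \sigma'', \qquad \sigma_{,0\lambda} = 0,$$

and

$$\Delta_1\sigma = \sigma'^2, \qquad \Delta_2\sigma = \sigma'' + 3R^{-1}R'\sigma'.$$

From (7) and (22), equations (19) therefore reduce to the two equations

$$(\kappa - \bar{\kappa}e^{\psi})\rho = -3\sigma'^2 - 6R^{-1}R'\sigma' - \Lambda + \bar{\Lambda}e^{2\sigma}, \tag{25}$$

$$(\kappa - \bar{\kappa}e^{\psi})p = 2\sigma'' + \sigma'^2 + 4R^{-1}R'\sigma' + \Lambda - \bar{\Lambda}e^{2\sigma}. \tag{26}$$

From (7), $T_0^{\,0} = \rho$ and $T = \rho - 3p$. Hence (20) becomes

$$(\psi_1 + 1)\rho + 3p = 0. \tag{27}$$

This relation is not, however, independent of (25) and (26) and can be deduced from them.

Eliminating $\psi$ between (25) and (26), we get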

2po n + (p + 3 /)cr' 2 + ( 4 p + 6 p) R- a R V + 3 (p + /)(A - Xe 2a ) = o. (28) 

This, then, is the only equation which $\sigma$ must satisfy, and for each solution, $\psi$ is given by (25)
or (26). Hence,

Every Lemaitre model admits a non-uniform regraduation. Also: for a given Lemaitre
model, a regraduation can always be found so that $\bar{\kappa}$ and $\bar{\Lambda}$ have any chosen values ($\bar{\kappa} > 0$),
including $\bar{\Lambda} = 0$ even when $\Lambda \neq 0$. This proves our earlier statement that the cosmical constant
in a Lemaitre universe can always be transformed away. It may not follow that it is always
desirable to do so; a static model with non-zero $\Lambda$ may, for example, be preferred to an
equivalent non-static model with zero $\Lambda$.

12. It is clear from the above equations that any further restriction imposed upon $\sigma$ and $\psi$
has the effect of limiting the class of Lemaitre models which admit such a regraduation. For
example, McVittie’s assumption $\psi = 0$ implies $\rho + 3p = 0$, and so rules out all Lemaitre models
which do not satisfy this. It is, however, of interest to consider certain limitations in greater
detail, and we shall now consider those Lemaitre models which admit the restricted
regraduations

$$\psi = n\sigma + a, \tag{29}$$

where $n$ and $a$ are any constants.

* See H. P. Robertson, 1933, for a detailed account of Lemaitre models. Certain conventions there differ
from ours, e.g. $ds$, which is a length in Robertson’s paper, is a time in ours.




From (27) we have

$$(n + 1)\rho + 3p = 0, \tag{30}$$

this, then, being the physical interpretation of (29). We are thus restricted to the class $\mathcal{K}$
of Lemaitre models in each of which $\rho$ and $p$ satisfy a linear relation of the form (30). This
relation gives the value of $n$, which is therefore known for each member of $\mathcal{K}$ unless it is an
empty model ($\rho = p = 0$). Since the ratio $p/\rho$ at any event is not affected by a regraduation,
every model equivalent to a member of $\mathcal{K}$ also belongs to $\mathcal{K}$ and has the same value of $n$
associated with it. Some well-known members of $\mathcal{K}$ are the “$p = 0$” Lemaitre models,
given by $n = -1$, and the “radiation” ($\rho_0 = 0$) models, given by $n = -2$.

Substituting from (23) and (24) in (30), we get

$$2RR'' - nR'^2 + \tfrac{1}{3}\Lambda(n - 2)R^2 - nk = 0, \tag{31}$$

which is therefore the equation for $R$ corresponding to (30). Multiplying by $R^{-n-1}R'$, this
equation can be integrated to give

$$R'^2 = \mu R^n + \tfrac{1}{3}\Lambda R^2 - k, \tag{32}$$

where $\mu$ is an arbitrary constant. Differentiating and dividing by $2R'$, we get

$$R'' = \tfrac{1}{2}n\mu R^{n-1} + \tfrac{1}{3}\Lambda R. \tag{33}$$

This derivation assumes $R' \neq 0$, but (32) and (33) are still equivalent to (31) when $R' = 0$,
$R$ and $\mu$ then being given by

$$R^2 = \frac{3nk}{\Lambda(n - 2)}, \qquad \mu = -\frac{2k}{n - 2}R^{-n}. \tag{34}$$
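The integration step can be cross-checked numerically. The following Python sketch evaluates (32) and its derivative at an arbitrary point and confirms that the left-hand side of (31) vanishes; it also confirms that the static values (34) make both $R'^2$ and $R''$ zero. The function names and sample numbers are illustrative assumptions and are not taken from the paper, and (33) is assumed in the form $R'' = \tfrac{1}{2}n\mu R^{n-1} + \tfrac{1}{3}\Lambda R$ obtained by differentiating (32).

```python
import math

def r_prime_squared(R, mu, lam, k, n):
    """Right-hand side of (32): R'^2 = mu R^n + (1/3) Lambda R^2 - k."""
    return mu * R**n + lam * R**2 / 3.0 - k

def r_double_prime(R, mu, lam, n):
    """Assumed form of (33), from differentiating (32) and dividing by 2R'."""
    return 0.5 * n * mu * R**(n - 1) + lam * R / 3.0

def residual_31(R, mu, lam, k, n):
    """Left-hand side of (31); it should vanish when (32) and (33) hold."""
    rp2 = r_prime_squared(R, mu, lam, k, n)
    rpp = r_double_prime(R, mu, lam, n)
    return 2.0 * R * rpp - n * rp2 + lam * (n - 2) * R**2 / 3.0 - n * k

def static_model(k, n, lam):
    """Static values (34): R^2 = 3nk / (Lambda (n - 2)), mu = -2k R^-n / (n - 2)."""
    R = math.sqrt(3.0 * n * k / (lam * (n - 2)))
    mu = -2.0 * k * R**(-n) / (n - 2)
    return R, mu

# Hypothetical sample values, chosen only for illustration.
print(residual_31(R=1.7, mu=0.8, lam=0.3, k=1, n=-1))
R_e, mu_e = static_model(k=1, n=-1, lam=0.25)
print(R_e, mu_e, r_prime_squared(R_e, mu_e, 0.25, 1, -1))
```

With $k = 1$, $n = -1$ the static values are those of an Einstein-type model, for which $\Lambda = R^{-2}$.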


Substituting from (32) and (33) in (23) and (24), we get

$$\kappa\rho = 3\mu R^{n-2}, \qquad \kappa p = -\mu(n + 1)R^{n-2}, \qquad \kappa\rho_0 = 3\mu(n + 2)R^{n-2}. \tag{35}$$

For a sensible model, $\rho$ and $\rho_0$ should not be negative, and $p$, if negative, should be very small.
These conditions are satisfied only when $\mu > 0$ and $-2 \le n \le -1 + \epsilon$, where $\epsilon$ is small.

We observe from (35) that $\mu = 0$ is characteristic of the empty models which belong to $\mathcal{K}$.
For these, $n$ is indeterminate, and (32) and (33) are still true but not (34).

When $n = -1$, then $p = 0$ and we have the well-known relation $\rho \propto R^{-3}$ for such models,
giving conservation of mass. This is clearly connected with the fact that the restriction
$\psi = -\sigma + a$ resulted from the definition of mass by means of elementary particles (§ 6).

When $n = -2$, then $\rho_0 = 0$ and $\rho \propto R^{-4}$, which is a well-known relation for radiation
models. These models are presumably connected in some reasonable way with the restriction
$\psi = -2\sigma + a$.

13. In consequence of (29), equation (28) for $\sigma$ must now be replaced by the first order
equation (25). Substituting from (35), we get

$$\sigma'^2 + 2R^{-1}R'\sigma' + \mu R^{n-2}(1 - \kappa^{-1}\bar{\kappa}e^{n\sigma + a}) + \tfrac{1}{3}\Lambda - \tfrac{1}{3}\bar{\Lambda}e^{2\sigma} = 0.$$

Writing

$$\bar{R} = Re^{\sigma}, \qquad \bar{\mu} = \mu\kappa^{-1}\bar{\kappa}e^{a} \tag{36}$$

and substituting from (32), this equation becomes

$$R^2\bar{R}'^2 = \bar{\mu}\bar{R}^{n+2} + \tfrac{1}{3}\bar{\Lambda}\bar{R}^4 - k\bar{R}^2. \tag{37}$$

Since only non-uniform regraduations are being considered, $\sigma$ is not constant, and the
possibility $\bar{R}/R$ = constant is excluded.

The appropriate space-time after regraduation is $\bar{V}_4$ with metric $d\bar{s}^2 = e^{2\sigma}ds^2$, and this
with (22) suggests a change of a basic coordinate from $t$ to $\bar{t}$, where

$$d\bar{t} = e^{\sigma}dt. \tag{38}$$

We then get

$$d\bar{s}^2 = d\bar{t}^2 - \bar{R}^2 a_{\lambda\mu}\,dx^\lambda dx^\mu, \tag{39}$$

where $\bar{R}$ is given by (36). The standard Lemaitre form has thus been recovered for the
new model.





From (38), equation (37) can now be written

$$\left(\frac{d\bar{R}}{d\bar{t}}\right)^2 = \bar{\mu}\bar{R}^n + \tfrac{1}{3}\bar{\Lambda}\bar{R}^2 - k. \tag{40}$$

Comparing with (32), we see that $\bar{\mu}$, as defined in (36), is the new value of the parameter $\mu$,
after regraduation.

Conversely, if M, $\bar{M}$ are non-static Lemaitre models satisfying (32) and (40) respectively,
where $\mu$ and $\bar{\mu}$ have the same sign (or vanish together), then they are equivalent. The
regraduation from M to $\bar{M}$ is given by $\sigma = \log\bar{R}(\bar{t}) - \log R(t)$ and $\psi = n\sigma + a$, where $t$ and $\bar{t}$ are
related by

$$\frac{d\bar{t}}{dt} = \frac{\bar{R}(\bar{t})}{R(t)} \tag{41}$$

and

$$a = \log(\kappa\bar{\mu}/\bar{\kappa}\mu). \tag{42}$$

This theorem applies also to the case when one of the models is static ($R$ = const.) except
that (32) is replaced by (34) when M is static, and similarly for $\bar{M}$.

The above theorems are remarkably general, and show that in a Lemaitre model of
class $\mathcal{K}$, the only elements which are invariant under all permissible regraduations are $k$, $n$,
and the signs of $\kappa$ and $\mu$.

14. Two models of class $\mathcal{K}$ which are of particular interest are those given by Einstein
and de Sitter. In the Einstein model, $R = R_e$, a constant, and

$$k = 1, \qquad n = -1, \qquad \mu = \tfrac{2}{3}R_e, \qquad \Lambda = R_e^{-2}. \tag{43}$$

These satisfy (34), and we have from the theorem of § 13:

Every non-static Lemaitre model of class $\mathcal{K}$ in which $k = 1$, $n = -1$, and $\mu > 0$ is equivalent
to the Einstein model.

The de Sitter universe is empty, and every representative Lemaitre model therefore has
$\mu = 0$ and $n$ indeterminate. There are several alternative forms, corresponding to different
values of $k$, this being due to the fact that the stream-lines are indeterminate. The static
form satisfies

$$k = 1, \qquad \mu = 0, \qquad \Lambda = 3R_0^{-2}, \qquad R = R_0.$$

Regraduations of a de Sitter model are uninteresting because they merely give other de Sitter
models, though they may change the form from static to non-static, or change the value of $R_0$.

15. A fact of some interest which arises out of the theorems of § 13 is that we can sometimes
regraduate a non-static Lemaitre model so that it becomes static.* Since a static model is
certainly of class $\mathcal{K}$, $\rho$ and $p$ being constants, it follows that the only models which can possibly
be regraduated to become static are those of class $\mathcal{K}$.

Taking M as a given non-static member of $\mathcal{K}$, then $k$, $n$ and the sign of $\mu$ are known, and
relations of the form (34) must hold for $\bar{R}$, $\bar{\Lambda}$ and $\bar{\mu}$. Since $\bar{\mu}$ must have the same sign as $\mu$,
we see that $\mu k(n - 2)$ must be negative or zero. Also, from the equations leading to (34), we
find that if either $k$ or $n - 2$ is zero, then the other must be zero.

Conversely, if $\mu k(n - 2) < 0$, or if $k = 0$ and $n = 2$, then M is equivalent to a static model
in which $\bar{R}$ can have any assigned value. For equations corresponding to (34) can then be
solved for $\bar{\Lambda}$ and $\bar{\mu}$, and the theorem of § 13 applies.

16. It has already been mentioned that in the case of a Lemaitre model, a regraduation
can always be found so that $\bar{\Lambda} = 0$. As an example of this and of the calculation described
in § 13, let us regraduate an Einstein model so that $\bar{\Lambda} = 0$.

In this case we see from (43) that $k = 1$, $n = -1$, and $\bar{\mu}$ can have any positive value. It
will be found that $\bar{R}$ varies from 0 to $\bar{\mu}$, and as a matter of interest let us choose $\bar{\mu} = 2R_e$.
Then (40) becomes

$$d\bar{R}/d\bar{t} = \eta(2R_e\bar{R}^{-1} - 1)^{1/2}, \qquad \eta = \pm 1.$$

* This is not equivalent to the result given by McVittie at the end of his paper, because he was concerned 
not with regraduations but with transformations of basic coordinates. 





This shows that $\bar{R}$ now varies from 0 to $2R_e$, and making the substitution

$$\bar{R} = 2R_e\sin^2\tfrac{1}{2}\phi, \qquad (2R_e\bar{R}^{-1} - 1)^{1/2} = \eta\cot\tfrac{1}{2}\phi,$$

we get

$$\bar{t} = R_e(\phi - \sin\phi), \tag{44}$$

where the constant of integration is chosen so that $\bar{t} = 0$ when $\bar{R} = 0$. Eliminating $\phi$, we
find $\bar{R}(\bar{t})$.

We now have all the elements of the new model, $\bar{\kappa}$ being supposed given. To find the
regraduation which produces this model, substitute for $R$ and $\bar{R}$ in (41). From (44), this
equation becomes $R_e\,d\phi = dt$, and taking $\bar{t} = 0$ in the new model to correspond to $t = 0$ in the
old, we find

$$\bar{t} = t - R_e\sin\frac{t}{R_e}. \tag{45}$$

This, then, is the regraduation of proper-time for all fundamental observers.

Substituting as explained in § 13, we find that the regraduation functions $\sigma$, $\psi$ are given by

$$e^{\sigma} = 2\sin^2\frac{t}{2R_e}, \qquad e^{\psi} = \frac{3\kappa}{2\bar{\kappa}}\,\mathrm{cosec}^2\frac{t}{2R_e}. \tag{46}$$

We thus see that the Einstein model is equivalent to a non-static model in which $\bar{\Lambda} = 0$.
This model oscillates with a constant period $2\pi R_e$ in both old and new time-scales, and the
value of $\bar{R}$ varies between 0 and $2R_e$, so that the model periodically contracts to a point.
The new density, $\bar{\rho}$, is $6\bar{\kappa}^{-1}R_e\bar{R}^{-3}$, and thus varies between $\tfrac{3}{4}\bar{\kappa}^{-1}R_e^{-2}$ and $\infty$; the old density
is $\rho = 2\kappa^{-1}R_e^{-2}$.
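The example of this section can be checked numerically. The Python sketch below assumes the forms $\bar{t} = t - R_e\sin(t/R_e)$ and $\bar{R} = 2R_e\sin^2(t/2R_e)$ for (45) and the substitution preceding (44), and compares $(d\bar{R}/d\bar{t})^2$, estimated by central differences, with $2R_e\bar{R}^{-1} - 1$ as required by (40) with $\bar{\Lambda} = 0$, $k = 1$, $n = -1$, $\bar{\mu} = 2R_e$. The function names, step size and sample values are hypothetical choices made for illustration.

```python
import math

def regraduated_einstein(t, R_e):
    """Return (t_bar, R_bar, sigma) at old proper-time t for the Einstein model."""
    phi = t / R_e
    t_bar = t - R_e * math.sin(phi)               # assumed form of (45)
    R_bar = 2.0 * R_e * math.sin(phi / 2.0) ** 2  # substitution preceding (44)
    sigma = math.log(R_bar / R_e)                 # e^sigma = R_bar / R with R = R_e
    return t_bar, R_bar, sigma

def check_40(t, R_e, h=1e-6):
    """Return ((dR_bar/dt_bar)^2, 2 R_e / R_bar - 1), the two sides of (40)."""
    tb1, Rb1, _ = regraduated_einstein(t - h, R_e)
    tb2, Rb2, _ = regraduated_einstein(t + h, R_e)
    _, Rb, _ = regraduated_einstein(t, R_e)
    slope = (Rb2 - Rb1) / (tb2 - tb1)  # central-difference estimate
    return slope ** 2, 2.0 * R_e / Rb - 1.0

# Hypothetical sample point: t = 1, R_e = 2.
print(check_40(1.0, 2.0))
```

At $t = \pi R_e$ the sketch gives $\bar{R} = 2R_e$, the maximum quoted above, and $\bar{t}$ advances by $2\pi R_e$ over one cycle, in agreement with the stated period.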

17. One other regraduation of special interest is that which gives $\bar{\kappa} = \kappa$, $\bar{\Lambda} = \Lambda$, and $\bar{\mu} = \mu$,
so that $a = 0$ from (42). This includes the case considered by McVittie, in which also $n = 0$.

It is easily verified that a more general regraduation $M \to \bar{M}$ in which $\bar{\Lambda}$ has the same sign
as $\Lambda$ can always be followed by a uniform regraduation $\bar{M} \to \bar{\bar{M}}$ so that finally $\bar{\bar{\kappa}} = \kappa$, and $\bar{\bar{\Lambda}} = \Lambda$
(and also $\bar{\bar{c}} = c$ if desired). We cannot, however, impose the additional relation $\bar{\bar{\mu}} = \mu$, for
it can be verified that under a uniform regraduation, $\bar{\bar{\mu}}\bar{\bar{\Lambda}}^{-n/2} = \bar{\mu}\bar{\Lambda}^{-n/2}$, whence $\bar{\bar{\mu}} = \mu$ only when
the first regraduation gives $\bar{\mu} = \mu(\bar{\Lambda}/\Lambda)^{n/2}$. Since uniform regraduations are trivial, we deduce
that in the general problem of regraduation there is no substantial loss of generality if we
assume that $\bar{c} = c$, $\bar{\kappa} = \kappa$, and $\bar{\Lambda} = 0$ or $\Lambda$; we cannot,* however, assume that $\bar{\mu} = \mu$.

Returning to the regraduation mentioned at the beginning of this section, we now see that
this is essentially special except when $\Lambda = 0$ and $n \neq 0$.

It will be assumed that M and $\bar{M}$ are non-static, and comparing (32) and (40) we see at
once that the zero of $\bar{t}$ can be chosen so that $\bar{R}$ is of the form

$$\bar{R}(\bar{t}) = R(\eta\bar{t}). \tag{47}$$

The relation between $t$ and $\bar{t}$ is found from (41), which integrates to give

$$\bar{t} = \eta H^{-1}(\eta H(t) + a), \tag{48}$$

where $H' = 1/R$, $H^{-1}(x)$ is the inverse function of $H(x)$, and $a$ is an arbitrary constant.
Functions $\sigma$, $\psi$ are now given in the usual way, $\sigma$ being $\log R(\eta\bar{t}) - \log R(t)$ and therefore not
constant unless $\eta$ and $a$ have particular values.

When $\eta = 1$, we see from (47) that the two models are exactly similar. This means, not
that the regraduation is trivial, but that it is equivalent to a correspondence of $V_4$ with itself.
Regarding (48) as defining a correspondence $P \to \bar{P}$ in $V_4$, then after regraduation, the new
description of the physical state at $P$ is identical with the original description of the state
at $\bar{P}$.

An illustrative example is the case $n = 0$, $\Lambda > 0$ considered by McVittie. Writing $\Lambda = 3\omega^2$,
then from (33) and (32) we find

$$R(t) = Ae^{\omega t} + Be^{-\omega t}, \qquad \mu = k - 4AB\omega^2,$$

where $A$ and $B$ are constants. To obtain (48) generally it is necessary to consider several
different cases, and we shall confine ourselves to the one given by McVittie, which is, in our

* The case when A=o is special; A is then arbitrary and we can assume p=p except when 72=0. 



notation, $R = Ae^{\omega t}$. This McVittie describes, rather misleadingly, as of de Sitter type; it is
actually a de Sitter model if $\mu = 0$, i.e. if $k = 0$. We find $H(t) = -(A\omega)^{-1}e^{-\omega t}$ and, from (48),

$$\bar{t} = -\frac{\eta}{\omega}\log(\eta e^{-\omega t} + \beta),$$

where $\beta$ is a constant replacing $a$. We therefore have

$\eta = 1$: $\sigma = -\log(1 + \beta e^{\omega t})$;

7] ~ — 1: cr = log ( — 

When $\eta = 1$ and $\beta$ is small, $\sigma$ is approximately $-\beta e^{\omega t}$. This is the approximate result
given by McVittie, who considers only infinitesimal regraduations and so does not obtain
results corresponding to $\eta = -1$, these having no infinitesimal member.

18. This concludes our study of regraduations. We have defined them, and have shown
that they are admitted by many systems of General Relativity when that theory is interpreted
in a certain reasonable way. In this interpretation, the status of the cosmical constant is
changed: it is in fact no longer cosmical in the sense that it is the same for all systems.

Certain restricted regraduations have been considered in detail, and also certain physical
systems which admit them, particularly the Lemaitre models. Special regraduations of
interest are those which turn a non-static model into a static, those which transform away
the cosmical constant, and those which leave constants $\kappa$, $\Lambda$ unaltered in value. Results
given by McVittie occur as special cases of these last regraduations.

Finally, it is clear that many problems still remain. There is the study of physical systems
with space-time (21) but in which pressure is not isotropic, and there is the singular case
$\psi_1 = -2$ in § 9. Deeper than these is the problem of finding what physical systems admit
regraduations for which $\psi$ is not a function of $\sigma$.


REFERENCES TO LITERATURE 

Eddington, A. S., 1930. The Mathematical Theory of Relativity, Cambridge.

Eisenhart, L. P., 1926. Riemannian Geometry, Princeton.

McVittie, G. C., 1945. “The Regraduation of Clocks in Spherically Symmetric Space-times of
General Relativity,” Proc. Roy. Soc. Edin., LXII, A, 147-155.

Robertson, H. P., 1933. “Relativistic Cosmology,” Reviews of Modern Physics, V, 62-90.

Synge, J. L., 1934. “The Energy Tensor of a Continuous Medium,” Trans. Roy. Soc. Canada,
xxviii, 127-171.


(Issued separately March 21, 1946) 





XX.— Evaluation and Application of Certain Ladder-Type Networks. By W. E. 
Bruges, M.Sc., A.C.G.I., D.I.C., Assoc.M.Inst.C.E. Communicated by Professor 
M. G. Say, Ph.D., M.Sc. (With Seven Text-figures.) 

(MS. received March 10, 1945. Revised MS. received July 12, 1945. Read July 2, 1945)

This paper has for object the calculation of a ladder network, using trigonometrical functions 
of real multiples of ( -1)£ which, in many cases, simplify practical formulae. The work was 
prepared particularly for application to transmission lines, conductors in electrical machines, 
and isolated cylindrical conductors. The effect of the conjunction of two or more dissimilar 
networks is considered, leading to a method of assessing the impedance of a conductor of any 
shape embedded in an open slot cut in highly permeable material. 


1. The Impedance of the Ladder Network 


The circuit in fig. 1 comprises $(n - 1)$ impedances $\delta z_1$ in series, LN, connected uniformly
through $n$ impedances $\delta z_2$ to a conductor HK which may sometimes be identified with “earth”:
$\delta z_1$ and $\delta z_2$ are vector quantities of the form $A + jB$, $A$ and $B$ being scalar. The vectors
$v_1, v_2, \dots, v_n$ represent the potential differences between the junctions of the series
impedances and the conductor HK. The vectors $i_1, i_2, i_3, \dots, i_n$ represent the currents in
the parallel impedances $\delta z_2$ in order. $I_n$ is the vector current flowing into the network at N
and out at K, N being at the end of the $(n - 1)$th series impedance $\delta z_1$ remotest from L. We
then have

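$$\delta z_2(i_n - i_{n-1}) = v_n - v_{n-1} = \delta z_1(i_1 + i_2 + i_3 + \dots + i_{n-1}),$$

also

$$v_n = \delta z_2 i_1 + \delta z_1[i_1 + (i_1 + i_2) + (i_1 + i_2 + i_3) + \dots],$$

and if $\delta z_1/\delta z_2 = \psi$, then

$$i_n = i_1(1 + \psi S_{n-1}),$$

where

$$i_1 S_n = i_n + 2i_{n-1} + 3i_{n-2} + \dots + n i_1,$$

$$i_1 S_{n-1} = i_{n-1} + 2i_{n-2} + 3i_{n-3} + \dots + (n - 1)i_1,$$

and

$$i_{n-1} = i_1(1 + \psi S_{n-2}),$$

$$i_{n-2} = i_1(1 + \psi S_{n-3}), \text{ and so on.}$$

Further

$$I_n = i_1 + i_2 + i_3 + \dots + i_n = i_1(S_n - S_{n-1})$$

and

$$S_n - S_{n-1} = n + \psi(S_{n-1} + S_{n-2} + \dots + S_1) = U_n$$

with

$$S_1 = 1. \tag{2}$$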


Substituting the values so obtained for $i_n$, $i_{n-1}$, $i_{n-2}$ … in the series for $i_1 S_n$ and dividing by
$i_1$ we have

$$S_n = \Sigma_n + \psi(S_{n-1} + 2S_{n-2} + 3S_{n-3} + \dots + (n - 1)S_1),$$

where $\Sigma_n = 1 + 2 + 3 + \dots + n$; similarly

$$S_{n-1} = \Sigma_{n-1} + \psi(S_{n-2} + 2S_{n-3} + 3S_{n-4} + \dots + (n - 2)S_1),$$

and so on. Repeated substitution of this recurrence relation gives

$$S_n = q_0 + q_1\psi + q_2\psi^2 + \dots + q_{n-1}\psi^{n-1},$$

where $q_0 = 1 + 2 + 3 + \dots + n$, and $q_1$ is obtained by multiplying each term of this
arithmetical progression by the sum of a similar progression to 1, 2, 3, … $(n - 1)$ terms, then
adding the terms; $q_2$ is obtained by multiplying each term of $q_1$ so obtained by the sum of the
same arithmetical progression to 1, 2, 3, … $(n - 2)$ terms, and so on. In practice these
terms may be obtained by writing down the progression, first forwards then backwards,
moving on the multiplier series a term at a time, multiplying only those terms directly under
one another, and adding:

$$q_1 = \Sigma_{n-1} + 2\Sigma_{n-2} + 3\Sigma_{n-3} \dots \text{ to } (n - 1) \text{ terms},$$

and

$$q_2 = \Sigma_{n-2} + (1.2 + 2.1)\Sigma_{n-3} + (1.3 + 2.2 + 3.1)\Sigma_{n-4} \dots \text{ to } (n - 2) \text{ terms},$$

the series in brackets being obtained by multiplying

      1 + 2 + 3 + 4 + …
by    … 4 + 3 + 2 + 1,

moving the lower series a term at a time to the right. To obtain $q_3$ we write down the series
so obtained and repeat the process:

      (1) + (1.2 + 2.1) + (1.3 + 2.2 + 3.1) + …
      … + 3 + 2 + 1,

whence

$$q_3 = \Sigma_{n-3} + (1.1.2 + 1.2.1 + 2.1.1)\Sigma_{n-4} + (1.1.3 + 1.2.2 + 1.3.1 + 2.1.2 + 2.2.1 + 3.1.1)\Sigma_{n-5} \dots$$

to $(n - 3)$ terms, and so on.

[Figs. 1, 2, 3. Ladder networks.]

The impedance of the network between HK and N is consequently

$$Z_n = \frac{v_n}{I_n} = \frac{\delta z_2\,i_1(1 + \psi S_{n-1})}{i_1(S_n - S_{n-1})} = \frac{\delta z_2(1 + \psi S_{n-1})}{S_n - S_{n-1}} = \frac{\delta z_2(1 + \psi S_{n-1})}{U_n}. \tag{3}$$

Tables are given of the coefficients of $\psi$ in $S_n$ and $U_n$ for values of $n$ from 1 to 15.
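The tabulated coefficients can be regenerated directly from the recurrence. The Python sketch below assumes the recurrence in the form $S_n = (1 + 2 + \dots + n) + \psi(S_{n-1} + 2S_{n-2} + \dots + (n - 1)S_1)$ with $S_1 = 1$, and $U_n = n + \psi(S_{n-1} + \dots + S_1)$; polynomials in $\psi$ are held as coefficient lists in ascending powers. The function names are illustrative and do not come from the paper.

```python
def poly_add(a, b):
    """Add two polynomials in psi stored as coefficient lists [c0, c1, ...]."""
    size = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(size)]

def s_polynomials(n_max):
    """Return a list S with S[n] = coefficients of S_n for n = 1..n_max (S[0] unused)."""
    S = [[], [1]]  # S_1 = 1
    for n in range(2, n_max + 1):
        acc = []
        for m in range(1, n):  # S_{n-1} + 2 S_{n-2} + ... + (n-1) S_1
            acc = poly_add(acc, [m * c for c in S[n - m]])
        # constant term 1 + 2 + ... + n, then the accumulated sum shifted by one power of psi
        S.append([n * (n + 1) // 2] + acc)
    return S

def u_polynomial(S, n):
    """U_n = n + psi (S_{n-1} + ... + S_1), as a coefficient list."""
    acc = []
    for k in range(1, n):
        acc = poly_add(acc, S[k])
    return [n] + acc

S = s_polynomials(6)
print(S[4], u_polynomial(S, 5))
```

The lists returned for small $n$ can be compared row by row with the tables; the identity $S_n - S_{n-1} = U_n$ gives a further check.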


Table I. To Find $S_n$

Each row lists the coefficients of $\psi^0, \psi^1, \psi^2, \dots, \psi^{14}$ in that order.

n = 1:   1
n = 2:   3, 1
n = 3:   6, 5, 1
n = 4:   10, 15, 7, 1
n = 5:   15, 35, 28, 9, 1
n = 6:   21, 70, 84, 45, 11, 1
n = 7:   28, 126, 210, 165, 66, 13, 1
n = 8:   36, 210, 462, 495, 286, 91, 15, 1
n = 9:   45, 330, 924, 1287, 1001, 455, 120, 17, 1
n = 10:  55, 495, 1716, 3003, 3003, 1820, 680, 153, 19, 1
n = 11:  66, 715, 3003, 6435, 8008, 6188, 3060, 969, 190, 21, 1
n = 12:  78, 1001, 5005, 12870, 19448, 18564, 11628, 4845, 1330, 231, 23, 1
n = 13:  91, 1365, 8008, 24310, 43758, 50388, 38760, 20349, 7315, 1771, 276, 25, 1
n = 14:  105, 1820, 12376, 43758, 92378, 125970, 116280, 74613, 33649, 10626, 2300, 325, 27, 1
n = 15:  120, 2380, 18564, 75582, 184756, 293930, 319770, 245157, 134596, 53130, 14950, 2925, 378, 29, 1
Sign*:   +, +j, −, −j, +, +j, −, −j, +, +j, −, −j, +, +j, −



Table II. To Find $U_n$

Each row lists the coefficients of $\psi^0, \psi^1, \psi^2, \dots, \psi^{13}$ in that order.

n = 1:   1
n = 2:   2, 1
n = 3:   3, 4, 1
n = 4:   4, 10, 6, 1
n = 5:   5, 20, 21, 8, 1
n = 6:   6, 35, 56, 36, 10, 1
n = 7:   7, 56, 126, 120, 55, 12, 1
n = 8:   8, 84, 252, 330, 220, 78, 14, 1
n = 9:   9, 120, 462, 792, 715, 364, 105, 16, 1
n = 10:  10, 165, 792, 1716, 2002, 1365, 560, 136, 18, 1
n = 11:  11, 220, 1287, 3432, 5005, 4368, 2380, 816, 171, 20, 1
n = 12:  12, 286, 2002, 6435, 11440, 12376, 8568, 3876, 1140, 210, 22, 1
n = 13:  13, 364, 3003, 11440, 24310, 31824, 27132, 15504, 5985, 1540, 253, 24, 1
n = 14:  14, 455, 4368, 19448, 48620, 75582, 77520, 54264, 26334, 8855, 2024, 300, 26, 1
n = 15:  15, 560, 6188, 31824, 92378, 167960, 203490, 170544, 100947, 42504, 12650, 2600, 351, 28
Sign*:   +, +j, −, −j, +, +j, −, −j, +, +j, −, −j, +, +j


* Applicable where $\psi$ is of the form $jx/r$.


To extend the tables downwards for $\psi^x$, sum the figures in the column for $\psi^{x-1}$ and add
the bottom existing figure in the $\psi^x$ column.
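
The extension rule can be checked mechanically. The sketch below assumes (this is an observation about the printed figures, not a statement from the paper) that the coefficient of $\psi^p$ in $U_n$ equals the binomial coefficient $\binom{n+p}{2p+1}$; the function names are illustrative only.

```python
from math import comb


def u_row(n):
    """Coefficients of psi^0, psi^1, ... in U_n (assumed closed form C(n+p, 2p+1))."""
    return [comb(n + p, 2 * p + 1) for p in range(n)]


def extend_column(lower_power_col, col):
    """Apply the stated rule: the next entry of a column is the sum of the
    existing figures in the column for the next lower power of psi, plus the
    bottom existing figure of the column being extended."""
    return col + [sum(lower_power_col) + col[-1]]


# Column psi^0 for n = 1..4 and column psi^1 for n = 2..4, read from Table II.
psi0 = [u_row(n)[0] for n in range(1, 5)]   # [1, 2, 3, 4]
psi1 = [u_row(n)[1] for n in range(2, 5)]   # [1, 4, 10]
print(extend_column(psi0, psi1))            # last entry is the n = 5 figure
print(u_row(5))
```

Under this assumption the rule and the closed form agree row by row, which gives a quick way to proof-read an extended table.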

In the limit, when $n$ is large, the process described above for evaluating the coefficients
$q_0, q_1, q_2, \ldots$ is equivalent to a successive double integration of these coefficients: thus
$q_1$ (with $n$ large) approaches $\int_0^n\!\int_0^n q_0\,dn\,dn = n^4/4!$, $q_2$ approaches $\int_0^n\!\int_0^n q_1\,dn\,dn = n^6/6!$, etc. Hence when
$n$ is large,

$$S_n = \frac{n^2}{2!} + \frac{n^4}{4!}\psi + \frac{n^6}{6!}\psi^2 + \ldots,$$

and $S_n - S_{n-1}$ approaches the differential coefficient of $S_n$ with respect to $n$, i.e.

$$S_n - S_{n-1} \text{ approaches } \frac{\partial S_n}{\partial n} = n + \frac{n^3}{3!}\psi + \frac{n^5}{5!}\psi^2 + \ldots \quad (\text{treating } \psi \text{ as constant}),$$

and similarly $U_n - U_{n-1}$ approaches $1 + \dfrac{n^2}{2!}\psi + \dfrac{n^4}{4!}\psi^2 + \ldots$

The same results may be obtained by summing the coefficients of $\psi$, $\psi^2$, $\psi^3$ ... in turn
by the method of differences, and taking the limits when $n$ is infinite. A rigid proof is possible
in this way.

Suppose that the impedance NL is uniformly distributed over a length $l_n$: then the series
impedance per unit length is

$$z_1 = n\,\delta z_1/l_n, \tag{5}$$

and the parallel impedance per unit length is

$$z_2 = l_n\,\delta z_2/n. \tag{6}$$


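Writing $\theta = n^2\psi = l_n^2\,z_1/z_2$ and $z_1 z_2 = \delta z_1\,\delta z_2$ we have, for $n$ large:
$1 + \psi S_n$ approaches $1 + (\theta/2!) + (\theta^2/4!) + \ldots$, that is,

$$\lim_{n\to\infty}(1 + \psi S_n) = \cosh\sqrt{\theta}. \tag{7}$$

Similarly $\sqrt{\psi}\,(S_n - S_{n-1})$ approaches $\sqrt{\theta} + (\theta\sqrt{\theta}/3!) + (\theta^2\sqrt{\theta}/5!) + \ldots$, that is,

$$\lim_{n\to\infty}\sqrt{\psi}\,(S_n - S_{n-1}) = \lim_{n\to\infty}\sqrt{\psi}\,U_n = \sinh\sqrt{\theta}. \tag{8}$$

Also

$$\lim_{n\to\infty}(U_n - U_{n-1}) = \cosh\sqrt{\theta}, \tag{9}$$

and when $n$ approaches infinity

$$Z_n = \lim_{n\to\infty}\frac{\delta z_2\,(1 + \psi S_{n-1})}{U_n} = \sqrt{z_1 z_2}\,\frac{\cosh\sqrt{\theta}}{\sinh\sqrt{\theta}} = \sqrt{z_1 z_2}\,\coth\sqrt{\theta}. \tag{10}$$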
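
As a numerical illustration of the limit in eq. (10), the sketch below builds a finite ladder by repeated series and parallel combination and compares it with $\sqrt{z_1 z_2}\coth\sqrt{\theta}$. The element values, the function names and the choice $\delta z_1 = z_1 l_n/n$ (taken from eq. (5)) are assumptions made for the example; the network is taken to end in a shunt element $\delta z_2$ across KL.

```python
import cmath


def ladder_impedance(z1, z2, length, n):
    """Input impedance of a ladder with n shunt elements and n - 1 series elements."""
    dz1 = z1 * length / n      # series element, from z1 = n*dz1/l_n
    dz2 = z2 * n / length      # shunt element, from z2 = l_n*dz2/n
    z = dz2                    # far-end shunt element across KL
    for _ in range(n - 1):
        # add one series element, then one shunt element in parallel
        z = dz2 * (z + dz1) / (dz2 + z + dz1)
    return z


def limit_impedance(z1, z2, length):
    """sqrt(z1*z2) * coth(sqrt(theta)), with sqrt(theta) = l_n*sqrt(z1/z2)."""
    g = cmath.sqrt(z1 / z2)
    # z2*g is one consistent choice of sqrt(z1*z2); the product is unchanged
    # if the other square root of z1/z2 is taken.
    return z2 * g / cmath.tanh(length * g)


if __name__ == "__main__":
    z1, z2, length = 2j, 3.0, 1.0   # illustrative values only
    for n in (10, 100, 1000):
        print(n, ladder_impedance(z1, z2, length, n))
    print("limit", limit_impedance(z1, z2, length))
```

The finite values approach the limit roughly in proportion to $1/n$, so a few thousand elements already agree with eq. (10) to about three figures.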


If M is any point in LN, where LM = $l$, including $(m-1)$ elements $\delta z_1$ and $m$ elements $\delta z_2$,

$$v_m/v_n = i_m/i_n = (1 + \psi S_{m-1})/(1 + \psi S_{n-1}) = \cosh\sqrt{\theta_m}/\cosh\sqrt{\theta_n} \tag{11}$$

with $m$ and $n$ infinite: here $v_m$ is the potential difference between M and K. Thus conditions
in any part of the network may be found.

The impedance $Z_n$ when $n$ is infinite may also be found by considering conditions at the
point M, where $v_m = z_2\,dI/dl$, so that $z_2/dl$ is an element of parallel impedance corresponding
to an element of parallel current $dI$ ($dI$ corresponds to the shunt current $i_m$ in fig. 1). Further,
$dv_m = I z_1\,dl$, where $I$ is the current at the point M in impedances $\delta z_1$. Thus

$$d^2 v_m/dl^2 = z_1\,dI/dl = v_m\,z_1/z_2,$$

giving

$$v_m = \tfrac{1}{2}A\,e^{\,l\sqrt{z_1/z_2}} + \tfrac{1}{2}B\,e^{-l\sqrt{z_1/z_2}}, \tag{12}$$

in which A and B are constants to be determined from limiting conditions. Since, however,
the same result should be obtained whether $l$ is positive or negative, it may be inferred that
A = B; hence

$$\frac{v_m}{v_n} = \frac{\cosh l\sqrt{z_1/z_2}}{\cosh l_n\sqrt{z_1/z_2}} = \frac{\cosh\sqrt{\theta_m}}{\cosh\sqrt{\theta_n}}$$

and

$$I_n = \int_0^{l_n} dI = \int_0^{l_n}(v/z_2)\,dl = \frac{v_n\sinh\sqrt{\theta_n}}{\sqrt{z_1 z_2}\,\cosh\sqrt{\theta_n}}.$$

Hence

$$Z_n = v_n/I_n = \sqrt{z_1 z_2}\,\coth\sqrt{\theta_n}. \tag{10}$$

It is convenient at this point to notice a case in which the network parameters vary inversely
as some function of $l$. Suppose that instead of $z_1$ and $z_2$ we have $z_1/\phi(l)$ and $z_2/\phi(l)$. Then
$v = z_2\,dI/\phi(l)\,dl$ and $dv = I z_1\,dl/\phi(l)$, so that

$$\frac{d^2 v}{dl^2} = \frac{z_1}{\phi(l)}\frac{dI}{dl} - \frac{I z_1\,\phi'(l)}{[\phi(l)]^2} = \frac{z_1}{z_2}\,v - \frac{\phi'(l)}{\phi(l)}\frac{dv}{dl}, \tag{13}$$

and if $\phi(l) = l$, for example, we have $\phi'(l) = 1$, and

$$\frac{d^2 v}{dl^2} + \frac{1}{l}\frac{dv}{dl} - \frac{z_1}{z_2}\,v = 0. \tag{13a}$$

After the removal of the constant $z_1/z_2$ by substitution, this becomes a Bessel equation of zero
order.


2. The Impedance of the Terminated Network 

We now consider the effect of replacing the impedance $\delta z_2$ across KL by an impedance
$Z_B$ as shown in fig. 2. $Z_B$ may be taken if needed as representing a second network, continuing
the first but having different parameters. We now have

$$\delta z_2\,(i_n - i_{n-1}) = v_n - v_{n-1} = \delta z_1\,(i_1 + i_2 + \ldots + i_{n-1});$$

also

$$v_n = Z_B\,i_1 + \delta z_1\,[\,i_1 + (i_1 + i_2) + (i_1 + i_2 + i_3) + \ldots\,]$$

and

$$i_n = i_1\,[(Z_B/\delta z_2) + \psi\,T_{n-1}],$$

where

$$i_1 T_n = i_n + 2i_{n-1} + 3i_{n-2} + \ldots + n\,i_1,$$

as for $i_1 S_n$, but differing in terms of $i_1$. Again


and 

with 

Thus 

Similarly 


4-i 3= 4[( Z 2*/S* 2 ) +V !fT n- 2 L and so on; 

In “ 

"I \ — ^ 4-1 83 (^8*2) + 0(T n _x + T w _ 2 + • • - + Tj), 

Tx-i. 

, T w a=^ n _ 1 (Z jB /82r 2 )+^ + 0 (T w _ 1 + 2T w-2 + 3T w _3+. . -+T X ). 

T^x = ^w-2( Z ^/8* 2 ) + » + ^(Tn-2 + 2 T n -3 + 3T n _4 + . . . + Tx), 


(14) 


and so on, and by repeated substitution of the recurrence relation we obtain a series as before 
of the form 

- y 0 + + < 2 iV + • • • + 


differing only from the series previously obtained for S n in that we replace h n by 
f^«-i( 2 W 8 *a) + »], > 4 -i by [>4-2( z i?/8* 2 ) + (» -1)], and so on. Also 


$$T_n = (Z_B/\delta z_2)\,S_{n-1} + (S_n - S_{n-1}),$$

that is,

$$T_n = (Z_B/\delta z_2)\,S_{n-1} + U_n. \tag{15}$$


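The impedance between HK and N is consequently

$$Z_n = \frac{v_n}{I_n} = \frac{\delta z_2\,[(Z_B/\delta z_2) + \psi\,T_{n-1}]}{T_n - T_{n-1}} \tag{16}$$

which, when $n$ approaches infinity, becomes

$$Z_n = \lim_{n\to\infty}\frac{Z_B(1 + \psi S_{n-2}) + \delta z_1\,(S_{n-1} - S_{n-2})}{(Z_B/\delta z_2)(S_{n-1} - S_{n-2}) + (U_n - U_{n-1})} = \frac{Z_B\cosh\sqrt{\theta} + \sqrt{z_1 z_2}\,\sinh\sqrt{\theta}}{(Z_B/\sqrt{z_1 z_2})\sinh\sqrt{\theta} + \cosh\sqrt{\theta}}. \tag{17}$$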

Conditions in the network can be found from

$$\frac{v_m}{v_n} = \frac{Z_B + \delta z_1\,T_{m-1}}{Z_B + \delta z_1\,T_{n-1}} = \frac{Z_B\cosh\sqrt{\theta_m} + \sqrt{z_1 z_2}\,\sinh\sqrt{\theta_m}}{Z_B\cosh\sqrt{\theta_n} + \sqrt{z_1 z_2}\,\sinh\sqrt{\theta_n}}. \tag{18}$$

If K and L are joined so that $Z_B = 0$, then

$$v_n/I_n = \sqrt{z_1 z_2}\,\tanh\sqrt{\theta}. \tag{19}$$

The impedance between points N and L in fig. 1 could be found by joining the centre
points RR'. The impedance between R and either N or L is $\sqrt{z_1 z_2}\,\tanh\tfrac{1}{2}l_n\sqrt{z_1/z_2}$, and
the total impedance of these two sections in series is twice this value. But the connector
RR' may be removed because R and R' are equipotential points under these conditions.
The impedance N to L is consequently

$$Z_A = 2\sqrt{z_1 z_2}\,\tanh\tfrac{1}{2}l_n\sqrt{z_1/z_2}. \tag{20}$$

Measured from equipotential points where $Z_B$ may be considered zero,

$$v_m/v_n = \sinh\sqrt{\theta_m}/\sinh\sqrt{\theta_n}. \tag{21}$$

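The impedance $Z_n$ between K and N, fig. 2, may alternatively be calculated as follows,
when $n$ is infinite: eq. (12) applies, but the constants A and B are different and are no longer
equal. When $l = 0$

$$v_1 = i_1 Z_B = \tfrac{1}{2}(A + B) \quad\text{and}\quad v_m = i_1 Z_B\,e^{\,l\sqrt{z_1/z_2}} - B\sinh l\sqrt{z_1/z_2}.$$

Again

$$I = \int dI = \int (v/z_2)\,dl = [\,i_1 Z_B/\sqrt{z_1 z_2}\,]\,e^{\,l\sqrt{z_1/z_2}} - [\,B/\sqrt{z_1 z_2}\,]\cosh l\sqrt{z_1/z_2}.$$

For $l = 0$, $I = i_1$:

$$i_1 = [\,i_1 Z_B/\sqrt{z_1 z_2}\,] - B/\sqrt{z_1 z_2}, \quad\text{and}\quad B = i_1\,[\,Z_B - \sqrt{z_1 z_2}\,],$$

so that

$$v = i_1 Z_B\,e^{\,l\sqrt{z_1/z_2}} - i_1\,[\,Z_B - \sqrt{z_1 z_2}\,]\sinh l\sqrt{z_1/z_2} = i_1\,[\,Z_B\cosh l\sqrt{z_1/z_2} + \sqrt{z_1 z_2}\,\sinh l\sqrt{z_1/z_2}\,].$$

Also

$$I = i_1\left[\frac{Z_B}{\sqrt{z_1 z_2}}\sinh l\sqrt{z_1/z_2} + \cosh l\sqrt{z_1/z_2}\right].$$

Hence the impedance between L and N is

$$Z_n = \frac{v_n}{I_n} = \frac{Z_B\cosh\sqrt{\theta} + \sqrt{z_1 z_2}\,\sinh\sqrt{\theta}}{[\,Z_B/\sqrt{z_1 z_2}\,]\sinh\sqrt{\theta} + \cosh\sqrt{\theta}}.$$

If $Z_B$ represents a network having parameters $z_1'$, $z_2'$ and $\theta'$ such that $z_1' z_2' = z_1 z_2$, then

$$Z_n = \sqrt{z_1 z_2}\,\coth(\sqrt{\theta} + \sqrt{\theta'}). \tag{17a}$$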
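
The terminated-network result lends itself to a short numerical check. The following sketch evaluates the closed form for $Z_n$ with an arbitrary termination and confirms the special cases of eqs. (17a) and (19); the parameter values and the function name are hypothetical, chosen only for illustration.

```python
import cmath


def terminated_impedance(zb, z1, z2, length):
    """Closed-form impedance of a uniform ladder of length `length`
    terminated in zb (the limiting form for n infinite)."""
    g = cmath.sqrt(z1 / z2)        # sqrt(theta) = length * g
    zc = z2 * g                    # a consistent value of sqrt(z1*z2)
    c = cmath.cosh(length * g)
    s = cmath.sinh(length * g)
    return (zb * c + zc * s) / ((zb / zc) * s + c)


if __name__ == "__main__":
    z1, z2, length = 2j, 3.0, 0.7          # illustrative values only
    g = cmath.sqrt(z1 / z2)
    zc = z2 * g
    root_theta_b = 0.5 + 0.3j              # sqrt(theta') of the second network
    zb = zc / cmath.tanh(root_theta_b)     # second network with z1'z2' = z1 z2
    print(terminated_impedance(zb, z1, z2, length))
    print(zc / cmath.tanh(length * g + root_theta_b))   # eq. (17a)
    print(terminated_impedance(0.0, z1, z2, length))    # eq. (19): zc*tanh
```

Cascading further lengths amounts to feeding each result back in as the next `zb`, which is the repeated use of eq. (17) described in the text.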

3. Ladder Network with Superimposed Load 

In fig. 2, $Z_B$ may be considered as a "load" impedance applied across the termination of
the network of fig. 1, $n$ being large. Alternatively we may place a load impedance $Z_L$ across
the terminals K and L as in fig. 3, when the impedances $\delta z_2$ and $Z_L$, being in parallel, may be
manipulated together as $Z_B$ in fig. 2. Now in certain practical problems $Z_L$ is unknown,
but the load and "leakage" currents are measurable. Consequently it is convenient to
eliminate $Z_L$ and provide expressions in terms of these two currents.

Suppose the load current to be $(k - 1)I_n$, where $k$ is any quantity of the form $A + jB$. $I_n$
is the "leakage" current as before, and the total current entering the network at N is now
$kI_n$ (fig. 3). We thus have $i_1 + i_2 + i_3 + \ldots + i_n = I_n$ as before, and $i' Z_B = i_1\,\delta z_2$ where
$i' = (k - 1)I_n + i_1$, whence

$$Z_B = \frac{i_1\,\delta z_2}{(k - 1)I_n + i_1}. \tag{22}$$



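Further

$$kI_n = i'(T_n - T_{n-1}) = i'\{(Z_B/\delta z_2)(S_{n-1} - S_{n-2}) + (U_n - U_{n-1})\} = \lim_{n\to\infty}\{(i_1/\sqrt{\psi})\sinh\sqrt{\theta} + [(k - 1)I_n + i_1]\cosh\sqrt{\theta}\}.$$

Therefore

$$Z_B = \lim_{n\to\infty}\frac{\delta z_2\,[k - (k - 1)\cosh\sqrt{\theta}]}{(k - 1)[(1/\sqrt{\psi})\sinh\sqrt{\theta} + \cosh\sqrt{\theta}] + k - (k - 1)\cosh\sqrt{\theta}} = \lim_{n\to\infty}\frac{\sqrt{\delta z_1\,\delta z_2}\,[k - (k - 1)\cosh\sqrt{\theta}]}{(k - 1)\sinh\sqrt{\theta} + k\sqrt{\psi}} = \frac{\sqrt{z_1 z_2}\,[k - (k - 1)\cosh\sqrt{\theta}]}{(k - 1)\sinh\sqrt{\theta}}, \tag{23}$$

since $k\sqrt{\psi} = k\sqrt{\theta}/n \to 0$ as $n \to \infty$. Thus the impedance of the network between K and N
in terms of $k$ and $I_n$ is

$$Z_n = \frac{\sqrt{z_1 z_2}\,[k - (k - 1)\cosh\sqrt{\theta}]\cosh\sqrt{\theta} + \sqrt{z_1 z_2}\,\sinh\sqrt{\theta}\,[(k - 1)\sinh\sqrt{\theta}]}{[k - (k - 1)\cosh\sqrt{\theta}]\sinh\sqrt{\theta} + [(k - 1)\sinh\sqrt{\theta}]\cosh\sqrt{\theta}} = \frac{v_n}{kI_n} = \frac{\sqrt{z_1 z_2}\,[k\cosh\sqrt{\theta} - (k - 1)]}{k\sinh\sqrt{\theta}}. \tag{24}$$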


As before, if $v_m$ is the potential difference between K and any point M in LN,

$$\frac{v_m}{v_n} = \frac{Z_B\cosh\sqrt{\theta_m} + \sqrt{z_1 z_2}\,\sinh\sqrt{\theta_m}}{Z_B\cosh\sqrt{\theta} + \sqrt{z_1 z_2}\,\sinh\sqrt{\theta}} = \frac{k\cosh\sqrt{\theta_m} - (k - 1)\cosh(\sqrt{\theta} - \sqrt{\theta_m})}{k\cosh\sqrt{\theta} - (k - 1)}, \tag{25}$$

on substituting for $Z_B$ from eq. (23).

The voltamperes absorbed by the network can be found by subtracting the load voltamperes
from the total input at HN, giving

$$kI_n v_n - (k - 1)I_n v_1 = k^2 I_n^2\,\frac{\sqrt{z_1 z_2}\,[k\cosh\sqrt{\theta} - (k - 1)]}{k\sinh\sqrt{\theta}} - (k - 1)I_n v_1. \tag{26}$$

Putting $\sqrt{\theta_m} = 0$ in eq. (25), we have

$$v_1 = \frac{v_n\,[k - (k - 1)\cosh\sqrt{\theta}]}{k\cosh\sqrt{\theta} - (k - 1)} = \frac{I_n\sqrt{z_1 z_2}\,[k - (k - 1)\cosh\sqrt{\theta}]}{\sinh\sqrt{\theta}},$$

using the value of $v_n$ obtained from eq. (24). Hence, substituting for $v_1$ in eq. (26), the active
and reactive voltamperes absorbed by the network, exclusive of those taken by the load
impedance $Z_L$, become

$$I_n^2\sqrt{z_1 z_2}\,[(2k^2 - 2k + 1)\cosh\sqrt{\theta} - 2k(k - 1)]/\sinh\sqrt{\theta}. \tag{27}$$

This result enables the “no-load” and “load” voltamperes to be compared for various load 
currents. 


4. Applications 

The application of the ladder network to transmission lines is well known. The development
above is particularly applicable to cases of discontinuity such as the series connection of
overhead and cable lines. Any number of such lengths can be added by use of eq. (17).
From eq. (27), line parameters may be chosen such that the magnitude B in $k = A + jB$ vanishes
for a given frequency, or approximates to zero over a given frequency band, so as to bring the
sending- and receiving-end currents into a co-phasal relation.

The networks in figs. 1 and 2 may be used as equivalent circuits representing either an
isolated conductor or a conductor in a slot cut in a mass of highly permeable material (a case
covering all rotary electrical machines of the electromagnetic type).

The circuit of fig. 1 applies to the case of a rectangular conductor in a rectangular slot,
fig. 4, A. Here HK in fig. 1 represents one end and N the other end of the conductor at the
points of entry into the slot. LN = $l_n$ is the depth of the slot occupied by the conductor:
elements $\delta z_1 = j\,\delta x$ comprise pure inductive reactance, and elements $\delta z_2 = \delta r$ are purely
resistive. The circuit demonstrates the building up of reactive electro-motive forces and the
tendency to force the current to the upper parts of the conductor. For this case

$$z_1 = jx = j\,4\pi\,(c/b')\,2\pi f \cdot 10^{-9} \tag{28}$$

and

$$z_2 = r = \rho c/b, \tag{28a}$$

where $\rho$ = resistivity of the conductor material, $b$ = width of conductor, $b'$ = width of slot,
$l_n$ = depth of slot filled by conductor, $c$ = length of slot, $f$ = frequency in cycles per sec. Further

$$\delta x = x\,l_n/(n - 1), \quad \delta r = rn/l_n, \quad\text{and}\quad \theta = l_n^2\,jx/r.$$

The reactance is based on the usual assumptions that (a) the magnetic flux passes across the
slot in straight lines normal to the sides of the slot, and (b) the magnetic permeability of the
magnetic material is infinite. The impedance of a single conductor in such a slot will
thus be $Z_n = \sqrt{jxr}\,\coth l_n\sqrt{jx/r}$: writing as scalar values $\alpha = l_n\sqrt{x/r}$ and $\beta = \sqrt{xr}$, then

$$Z_n = \beta\sqrt{j}\,\coth\alpha\sqrt{j} \tag{29}$$

$$= \beta\sqrt{j}\;\frac{\cosh\alpha\sqrt{j}\,.\,\sin\alpha\sqrt{j}}{\sinh\alpha\sqrt{j}\,.\,\sin\alpha\sqrt{j}} = \frac{\beta}{\sqrt{2}}\;\frac{\sinh\alpha\sqrt{2} + \sin\alpha\sqrt{2} + j(\sinh\alpha\sqrt{2} - \sin\alpha\sqrt{2})}{\cosh\alpha\sqrt{2} - \cos\alpha\sqrt{2}}. \tag{29a}$$

The real part of this agrees with a well-known result (Field, 1905), but Field does not give the 
imaginary (reactive) part, which is of great practical importance in the design of certain 
electrical machines. 
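
A brief numerical comparison of eqs. (29) and (29a) may be useful here. The sketch below evaluates the complex form directly and the separated real and imaginary parts; the values of $\alpha$ and $\beta$ and the function names are illustrative assumptions only.

```python
import cmath
import math


def slot_impedance(alpha, beta):
    """Eq. (29): beta*sqrt(j)*coth(alpha*sqrt(j)), evaluated with complex arithmetic."""
    root_j = cmath.sqrt(1j)
    return beta * root_j / cmath.tanh(alpha * root_j)


def slot_impedance_split(alpha, beta):
    """Eq. (29a): the same quantity with resistance and reactance separated."""
    a = alpha * math.sqrt(2.0)
    denom = math.cosh(a) - math.cos(a)
    resistance = (math.sinh(a) + math.sin(a)) / denom
    reactance = (math.sinh(a) - math.sin(a)) / denom
    return beta / math.sqrt(2.0) * complex(resistance, reactance)


if __name__ == "__main__":
    alpha, beta = 1.3, 2.0     # illustrative scalar values
    print(slot_impedance(alpha, beta))
    print(slot_impedance_split(alpha, beta))
```

The imaginary part returned by either function is the reactive component that the text notes is absent from Field's result.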

The circuit of fig. 2 applies to a conductor of stepped section in a slot of similar shape, as
shown in fig. 4, B. Although only one step is indicated, any number of steps may be dealt
with by application of eq. (17), $Z_B$ being first found from eq. (29) or (29a) for the lowest or
innermost step using $b_1$, $b_1'$, $l_1$ in place of $b$, $b'$, $l_n$, fig. 4, B. The composite impedance will be

$$Z_n = \frac{Z_B\cosh\alpha\sqrt{j} + \beta\sqrt{j}\,\sinh\alpha\sqrt{j}}{(Z_B/\beta\sqrt{j})\sinh\alpha\sqrt{j} + \cosh\alpha\sqrt{j}}. \tag{30}$$

$\alpha$ in this case is found by use of the dimensions of the next step, e.g. $b_2$, $b_2'$, $l_2$, in fig. 4, B. If
there is a further step, the value in eq. (30) becomes a new $Z_B$, and the process is repeated.

It will be seen from figs. 1 and 2 that the current density in that part of the conductor
nearest the slot-opening will be $v_n/\rho c$: hence the current density in any part of such a conductor
may be found after evaluating the impedances of the first, first and second, first, second
and third . . . steps in the manner described, using eq. (18) and working downwards from the
slot-opening. In eq. (18) $v_m$ and $v_n$ are proportional to the current-densities at the points
M and N. By putting $\theta_m = 0$ we find the current-density at the bottom of the step nearest
the opening, which will be the same as that at the top of the next lower step, and so on.

In the case of fig. 4, C, where a conductor of arbitrary section is embedded in, and conforms
proportionally to the shape of, a slot in highly magnetic material, let the width of the conductor
be $b\,\phi(l)$ at a distance $l$ from the bottom of the slot, and let that of the slot itself be
$b'\phi(l)$. Then, from eqs. (28) and (28a), $z_1 = jx/\phi(l)$ and $z_2 = r/\phi(l)$. Result (13) then applies,
and the solution depends on the nature of $\phi(l)$. For example, in the case of a triangular slot
with the vertex remote from the slot-opening, $z_1 = jx/\eta l$ and $z_2 = r/\lambda l$, where $\eta$ and $\lambda$ are constants:
we then have $\phi'(l)/\phi(l) = 1/l$, and eq. (13a) applies.

In most cases an approximate solution can more readily be obtained by dividing the
conductor into rectangular sections and adding, using eq. (17).

For an isolated cylindrical conductor of unit length, using eq. (28), $c = 1$, $b' = b = 2\pi l$,
where $l$ is the radius, and eq. (13a) is applicable. We consider the cylinder as divided into
elemental concentric cylinders having sections $2\pi l\,dl$: and further in the case of a hollow
tube we may consider a finite number of sections and so obtain the approximate impedances
and current-densities, from eqs. (17) and (18). This may be quicker than the more exact
method using Bessel functions, especially if the hollow cylinder comprises a number of thin
cylinders of materials differing in magnetic and conductional properties.

Fig. 5 is drawn to represent conditions in a rectangular slot in magnetic material where a
number of insulated rectangular conductors, not transposed within the slot, each carry the same
load current. The $k$th conductor (counting upwards from the bottom of the slot) is shown
with part of the next, the $(k - 1)$th conductor below it. By mutual induction an electro-motive
force will be developed in any conductor by the currents in all those beneath it. The
effect of the currents $I_n$ in the $(k - 1)$ lower conductors is superposed on the reactances $\delta x$ of
the equivalent circuit for the $k$th conductor. The formulae already obtained are now applicable.
Here $\theta = l^2 jx/r$, where $l$ is the depth of a single conductor: $x$, $r$, $\alpha$ and $\beta$ are found as before
from eqs. (28) and (28a). If $v_n$ is the potential difference between K and N, and $v'$ as shown
in fig. 5 represents the electro-motive force induced between the $(k - 1)$th and $k$th conductors,
then from eq. (27) the voltamperes absorbed are

$$I_n(v_n + v') = I_n^2\,\beta\sqrt{j}\;\frac{(2k^2 - 2k + 1)\cosh\alpha\sqrt{j} - 2k(k - 1)}{\sinh\alpha\sqrt{j}} \tag{31}$$

$$= \frac{I_n^2\,\beta}{\sqrt{2}}\;\frac{4k(k - 1)\left(\cosh\frac{\alpha}{\sqrt{2}} - \cos\frac{\alpha}{\sqrt{2}}\right)\left[\sinh\frac{\alpha}{\sqrt{2}} - \sin\frac{\alpha}{\sqrt{2}} + j\left(\sinh\frac{\alpha}{\sqrt{2}} + \sin\frac{\alpha}{\sqrt{2}}\right)\right] + \sinh\alpha\sqrt{2} + \sin\alpha\sqrt{2} + j(\sinh\alpha\sqrt{2} - \sin\alpha\sqrt{2})}{\cosh\alpha\sqrt{2} - \cos\alpha\sqrt{2}}. \tag{31a}$$

The real part of this agrees with that obtained by Field, who does not give the imaginary part.

The current-density in that part of the $k$th conductor nearest to the slot opening is $v_n/\rho c$,
obtainable from eq. (24) for any given current $I_n$; and that in any other part of the conductor
can be inferred from eq. (25). The conductors may be in series or in parallel. If the latter
they will normally be transposed so that there will be the same current in each.

Often a mean result for all the conductors in a slot is required rather than for the conductors
individually. A mean value can be had by making $k = 1, 2, 3, \ldots k$ in succession in eq. (31),
adding the separate results and dividing by $k$. The mean impedance per conductor is then

$$\frac{v_n + v'}{I_n} = \beta\sqrt{j}\;\frac{(2k^2 + 1)\cosh\alpha\sqrt{j} - (2k^2 - 2)}{3\sinh\alpha\sqrt{j}}.$$

It is to be observed that the impedance so calculated does not include reactance resulting
from magnetic leakage flux across the slot-opening above the conductors, nor from overhang
leakage. These effects are obtained and added in the manner well known to designers. If
the conductors, not being transposed, comprise a number of insulated strips joined together
at the ends of the slot, or elsewhere, an allowance must be made when calculating $r$ in eq. (28a).

5. Trigonometrical and Hyperbolic Functions of Real Multiples of $(-1)^{1/4}$

In many practical cases, such as that above, a ladder network of the form shown in fig. 1
may have all the series impedances $\delta z_1$ pure resistance and all the parallel impedances $\delta z_2$
pure reactance; or vice versa. It is evident from eqs. (29), (29a), (31), and (31a) that solutions
of such networks can be more conveniently expressed in the form of hyperbolic or trigonometrical
functions of real multiples of $(-1)^{1/4}$ than as ordinary functions of this type. Tables of these
functions are not readily available in a convenient form for general use, vide Funktionentafeln,
Jahnke and Emde, 1938; or Complex Functions and Atlas, Kennedy, 1900. The curves in figs. 6
and 7 are therefore given, from which, for many practical purposes, sufficiently close values
may be obtained.

Fig. 7.—Cothinj $\theta$.

To find cotinj $\theta$, change sign of imaginary part: e.g. cotinj (1) = 1·022 - j0·331.

In order to simplify nomenclature it is suggested that the functions following might be
written in the form shown against each:—

$(1/\sqrt{j})\sin\theta\sqrt{j}$ : sininj $\theta$
$(1/\sqrt{j})\sinh\theta\sqrt{j}$ : sinhinj $\theta$
$\cos\theta\sqrt{j}$ : cosinj $\theta$
$\cosh\theta\sqrt{j}$ : coshinj $\theta$
$(1/\sqrt{j})\tanh\theta\sqrt{j}$ : tanhinj $\theta$
$\sqrt{j}\coth\theta\sqrt{j}$ : cothinj $\theta$

and similarly for other functions. We then have

$2\sqrt{2}\,\text{sininj}\,\theta\,.\,\text{coshinj}\,\theta = \sinh\theta\sqrt{2} + \sin\theta\sqrt{2} + j(\sinh\theta\sqrt{2} - \sin\theta\sqrt{2}),$

$2\sqrt{2}\,\text{sinhinj}\,\theta\,.\,\text{cosinj}\,\theta = \sinh\theta\sqrt{2} + \sin\theta\sqrt{2} - j(\sinh\theta\sqrt{2} - \sin\theta\sqrt{2}),$

$2\,\text{sininj}\,\theta\,.\,\text{sinhinj}\,\theta = \cosh\theta\sqrt{2} - \cos\theta\sqrt{2},$

$2\,\text{cosinj}\,\theta\,.\,\text{coshinj}\,\theta = \cosh\theta\sqrt{2} + \cos\theta\sqrt{2},$

and a number of other formulae analogous to the ordinary trigonometric relations. 
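
Where tables or curves are not to hand, these functions can be evaluated directly with complex arithmetic. The following is a minimal sketch, assuming the definitions suggested above; the Python function names simply mirror the proposed nomenclature and are not part of the paper.

```python
import cmath
import math

ROOT_J = cmath.sqrt(1j)   # (-1)**(1/4), principal value (1 + j)/sqrt(2)


def sininj(t):
    return cmath.sin(t * ROOT_J) / ROOT_J


def sinhinj(t):
    return cmath.sinh(t * ROOT_J) / ROOT_J


def cosinj(t):
    return cmath.cos(t * ROOT_J)


def coshinj(t):
    return cmath.cosh(t * ROOT_J)


def cothinj(t):
    return ROOT_J / cmath.tanh(t * ROOT_J)


if __name__ == "__main__":
    t = 0.8                                        # illustrative argument
    print(cothinj(1.0))                            # compare with the caption of fig. 7
    print(2 * sininj(t) * sinhinj(t),
          math.cosh(t * math.sqrt(2)) - math.cos(t * math.sqrt(2)))
```

The value printed for `cothinj(1.0)` can be compared with the figure quoted for cotinj (1), remembering the change of sign of the imaginary part.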

The author records his thanks to Mr H. E. Clapham for the original suggestion that the 
ladder network could be applied to the case of a slot-conductor; to Professor M. G. Say for 
help in compiling the paper; and to a referee for several suggestions for shortening the proofs 
of certain expressions. 


(Issued separately March 25, 1946) 





XXI.— Tables of Chebyshev Polynomials. By C. W. Jones, M.Sc., and J. C. P. Miller, 
Ph.D., of the University of Liverpool, and J. F. C. Conn, D.Sc., and R. C. Pankhurst, 
Ph.D., of the National Physical Laboratory. Communicated by Dr A. C. Aitken, 
F.R.S. 

* (MS. received October 23, 1944. Read May 7, 1945) 


1. Introduction.—The object of this paper is twofold: firstly, to present a table of the
Chebyshev polynomials $C_n(x) = 2\cos(n\cos^{-1}\tfrac{1}{2}x)$ for $n = 1(1)12$ and $x = 0(0{\cdot}02)2$, values
being exact or to 10 decimals; secondly, to provide a working list of coefficients and formulae
relating to these and allied functions.

Valuable accounts of applications and properties will be found in Van der Pol and Weijers, 
1933, in Lanczos, 1938, and in Szego, 1939. Further applications are indicated in the 
following paper, by J. C. P. Miller, which also suggests methods of reducing the inconvenience 
caused by the present lack of tables of the allied polynomials S n (x). It is hoped that suitable 
tables will be prepared later. 

Manuscript tables of $C_n(x)$ and $S_n(x)$ at interval 0·001 in $x$ to 12 decimals have been
prepared by the New York Mathematical Tables Project , under the supervision of Dr A. N. 
Lowan (see Mathematical Tables and other Aids to Computation , 1, No. 4, p. 125, October 
1943), but considerable delay in publication seems probable. Because of this delay and 
because the present tables seem likely to be of particular convenience to computers, expected 
to be in the majority, who need only tabular values, and for whom a fairly wide interval will 
suffice, it has seemed desirable to proceed with publication. 

2.1. Definitions and Notations.—We define the Chebyshev Polynomials * as

$$C_n(x) = 2\cos n\theta \quad\text{where}\quad x = C_1(x) = 2\cos\theta. \tag{2.11}$$

The more usual definition, for which it is also convenient to have a notation, is

$$T_n(\mu) = \cos n\theta \quad\text{where}\quad \mu = T_1(\mu) = \cos\theta, \tag{2.12}$$

so that we have

$$x = 2\mu, \qquad C_n(x) = 2T_n(\mu) \tag{2.13}$$

and

$$C_n(x) = 2\cos(n\cos^{-1}\tfrac{1}{2}x), \qquad T_n(\mu) = \cos(n\cos^{-1}\mu). \tag{2.14}$$

The revised definition gives simpler numerical coefficients; it has been found more convenient
in theoretical investigations, and equally good for numerical applications.

For allied functions we use the notations

$$S_{n-1}(x) = U_{n-1}(\mu) = \frac{\sin n\theta}{\sin\theta}, \qquad \sigma_n(x) = 2\sin n\theta = 2\upsilon_n(\mu),$$

so that

$$\sigma_n(x) = \sigma_1(x)\,S_{n-1}(x), \qquad \upsilon_n(\mu) = \upsilon_1(\mu)\,U_{n-1}(\mu). \tag{2.15}$$

It should be noted that $S_{n-1}(x)$ and $U_{n-1}(\mu)$ are polynomials of degree $n - 1$ in $x$ and $\mu$
respectively, but that $\sigma_n(x)$ and $\upsilon_n(\mu)$ are not polynomials.

The range of $C_n(x)$, including $x = C_1(x)$, is from -2 to +2, if $\theta$ is to be real, whilst that
of $T_n(\mu)$, including $\mu = T_1(\mu)$, is from -1 to +1. It is convenient on occasion to use instead
the range 0 to 1 for the independent variable, which we shall then denote by X, such that
$\mu = 2X - 1$, as in Lanczos, 1938, p. 140, although the advantage of symmetry about the
origin is then lost.

* To obviate confusion we note that this term has also been applied to several other sets of polynomials;
for instance, it has been used as synonymous with the term orthogonal polynomials (see Szego, 1939, p. 25),
although this use now seems largely to have died out. Other more particular polynomials known by the name
of Chebyshev are (i) the orthogonal polynomials associated with sums over a set of equidistant abscissae
(see Szego, 1939, p. 32); (ii) the polynomials whose zeros give the abscissae associated with
Chebyshev's formulae for numerical quadrature.

For the general range, $a \le z \le b$ (Lanczos, 1938, p. 137), the definitions are

$$C_n = 2T_n = 2\cos n\theta, \qquad S_n = U_n = \frac{\sin(n + 1)\theta}{\sin\theta}, \quad\text{where}\quad \theta = \cos^{-1}\frac{2z - b - a}{b - a}. \tag{2.21}$$

We do not consider this range further apart from a mention in § 3.5.

2.2. Allied Functions.—If we write $\phi = \tfrac{1}{2}\pi - \theta$, so that $2\cos\theta = 2\sin\phi = x$, we have

$$2\sin(2m + 1)\phi = (-1)^m\,.\,2\cos(2m + 1)\theta = (-1)^m C_{2m+1}(x),$$

$$2\sin 2m\phi = (-1)^{m+1}\,.\,2\sin 2m\theta = (-1)^{m+1}\sigma_1(x)S_{2m-1}(x) = (-1)^{m+1}\sigma_{2m}(x),$$

$$2\cos(2m + 1)\phi = (-1)^m\,.\,2\sin(2m + 1)\theta = (-1)^m\sigma_1(x)S_{2m}(x) = (-1)^m\sigma_{2m+1}(x),$$

$$2\cos 2m\phi = (-1)^m\,.\,2\cos 2m\theta = (-1)^m C_{2m}(x).$$

Van der Pol and Weijers, 1933, p. 81, give the same relations in terms of $T_n(\mu)$ and $\upsilon_n(\mu)$.
Other polynomials of interest (Szego, 1939, p. 3, etc.) are

$$\frac{\cos(n + \tfrac{1}{2})\theta}{\cos\tfrac{1}{2}\theta} = S_n(x) - S_{n-1}(x) = U_n(\mu) - U_{n-1}(\mu) = (-1)^n U_{2n}(\sqrt{1 - X}),$$

$$\frac{\sin(n + \tfrac{1}{2})\theta}{\sin\tfrac{1}{2}\theta} = S_n(x) + S_{n-1}(x) = U_n(\mu) + U_{n-1}(\mu) = U_{2n}(\sqrt{X}). \tag{2.22}$$


3. Formulae.—Most of these are readily derived from well-known properties of the
circular functions.

3.1. Recurrence Relations.

$$y_{n+1} - x\,y_n + y_{n-1} = 0 \qquad \text{Satisfied by } C_n, T_n, S_n, U_n, \sigma_n, \upsilon_n. \tag{3.11}$$

$2\mu$ may replace $x$.

$$y_{n+m} - C_m\,y_n + y_{n-m} = 0 \qquad \text{Satisfied by } C_n, T_n, S_n, U_n, \sigma_n, \upsilon_n. \tag{3.12}$$

$2T_m$ may replace $C_m$.

3.2. Explicit Forms.—These are readily derived from (3.11) starting with the values for
$n = 0$ and $n = 1$. One set of coefficients suffices for expansions in powers both of X and of
$\mu$, for

$$T_n(\mu) = \cos n\theta = \cos 2n(\tfrac{1}{2}\theta) = T_{2n}(\cos\tfrac{1}{2}\theta) = T_{2n}(\sqrt{X}) \tag{3.21}$$

since

$$X = \tfrac{1}{2}(1 + \mu) = \cos^2\tfrac{1}{2}\theta. \tag{3.22}$$

Thus, for example,

$$T_2(\mu) = 2\mu^2 - 1 = 8X^2 - 8X + 1 \quad\text{and}\quad T_4(\mu) = 8\mu^4 - 8\mu^2 + 1.$$

Likewise

$$\upsilon_n(\mu) = \upsilon_1(\mu)\,U_{n-1}(\mu) = \upsilon_{2n}(\sqrt{X}) = \sqrt{1 - X}\;U_{2n-1}(\sqrt{X}) \tag{3.23}$$

and

$$U_{n-1}(\mu) = \frac{1}{2\sqrt{X}}\,U_{2n-1}(\sqrt{X}). \tag{3.24}$$

We write

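$$C_n(x) = c_{n,0}\,x^n - c_{n,2}\,x^{n-2} + c_{n,4}\,x^{n-4} - \ldots, \qquad S_n(x) = s_{n,0}\,x^n - s_{n,2}\,x^{n-2} + \ldots,$$
$$T_n(\mu) = t_{n,0}\,\mu^n - t_{n,2}\,\mu^{n-2} + t_{n,4}\,\mu^{n-4} - \ldots, \qquad U_n(\mu) = u_{n,0}\,\mu^n - u_{n,2}\,\mu^{n-2} + \ldots, \tag{3.25}$$

where

$$c_{n,2p} = \frac{n\,.\,(n - p - 1)!}{p!\,(n - 2p)!}, \qquad s_{n,2p} = \frac{(n - p)!}{p!\,(n - 2p)!}, \qquad t_{n,2p} = 2^{\,n-2p-1}c_{n,2p}, \qquad u_{n,2p} = 2^{\,n-2p}s_{n,2p}, \qquad n \ge 2p \ge 0, \tag{3.26}$$

and all coefficients are zero if $2p > n$. Values of some of the coefficients are tabulated below.
Then, as we have seen,

$$T_n(2X - 1) = t_{2n,0}\,X^n - t_{2n,2}\,X^{n-1} + t_{2n,4}\,X^{n-2} - \ldots$$
$$U_n(2X - 1) = \tfrac{1}{2}u_{2n+1,0}\,X^n - \tfrac{1}{2}u_{2n+1,2}\,X^{n-1} + \ldots \tag{3.27}$$

Also

$$\sigma_0(x) = \upsilon_0(\mu) = 0, \qquad \sigma_1(x) = \sqrt{4 - x^2}, \qquad \upsilon_1(\mu) = \sqrt{1 - \mu^2},$$
$$\sigma_n(x) = \sigma_1(x)\,S_{n-1}(x), \qquad \upsilon_n(\mu) = \upsilon_1(\mu)\,U_{n-1}(\mu). \tag{3.28}$$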
c_{n,2p}

n\2p     0     2     4     6     8    10    12    14    16    18    20
  0      2
  1      1
  2      1     2
  3      1     3
  4      1     4     2
  5      1     5     5
  6      1     6     9     2
  7      1     7    14     7
  8      1     8    20    16     2
  9      1     9    27    30     9
 10      1    10    35    50    25     2
 11      1    11    44    77    55    11
 12      1    12    54   112   105    36     2
 13      1    13    65   156   182    91    13
 14      1    14    77   210   294   196    49     2
 15      1    15    90   275   450   378   140    15
 16      1    16   104   352   660   672   336    64     2
 17      1    17   119   442   935  1122   714   204    17
 18      1    18   135   546  1287  1782  1386   540    81     2
 19      1    19   152   665  1729  2717  2508  1254   285    19
 20      1    20   170   800  2275  4004  4290  2640   825   100     2













s_{n,2p}

n\2p     0     2     4     6     8    10    12    14    16    18    20
  0      1
  1      1
  2      1     1
  3      1     2
  4      1     3     1
  5      1     4     3
  6      1     5     6     1
  7      1     6    10     4
  8      1     7    15    10     1
  9      1     8    21    20     5
 10      1     9    28    35    15     1
 11      1    10    36    56    35     6
 12      1    11    45    84    70    21     1
 13      1    12    55   120   126    56     7
 14      1    13    66   165   210   126    28     1
 15      1    14    78   220   330   252    84     8
 16      1    15    91   286   495   462   210    36     1
 17      1    16   105   364   715   792   462   120     9
 18      1    17   120   455  1001  1287   924   330    45     1
 19      1    18   136   560  1365  2002  1716   792   165    10
 20      1    19   153   680  1820  3003  3003  1716   495    55     1





t_{n,2p}

n\2p        0         2         4         6         8        10        12        14        16        18        20
  0         1
  1         1
  2         2         1
  3         4         3
  4         8         8         1
  5        16        20         5
  6        32        48        18         1
  7        64       112        56         7
  8       128       256       160        32         1
  9       256       576       432       120         9
 10       512      1280      1120       400        50         1
 11      1024      2816      2816      1232       220        11
 12      2048      6144      6912      3584       840        72         1
 13      4096     13312     16640      9984      2912       364        13
 14      8192     28672     39424     26880      9408      1568        98         1
 15     16384     61440     92160     70400     28800      6048       560        15
 16     32768    131072    212992    180224     84480     21504      2688       128         1
 17     65536    278528    487424    452608    239360     71808     11424       816        17
 18    131072    589824   1105920   1118208    658944    228096     44352      4320       162         1
 19    262144   1245184   2490368   2723840   1770496    695552    160512     20064      1140        19
 20    524288   2621440   5570560   6553600   4659200   2050048    549120     84480      6600       200         1







u_{n,2p}

n\2p        0         2         4         6         8        10        12        14        16        18        20
  0         1
  1         2
  2         4         1
  3         8         4
  4        16        12         1
  5        32        32         6
  6        64        80        24         1
  7       128       192        80         8
  8       256       448       240        40         1
  9       512      1024       672       160        10
 10      1024      2304      1792       560        60         1
 11      2048      5120      4608      1792       280        12
 12      4096     11264     11520      5376      1120        84         1
 13      8192     24576     28160     15360      4032       448        14
 14     16384     53248     67584     42240     13440      2016       112         1
 15     32768    114688    159744    112640     42240      8064       672        16
 16     65536    245760    372736    292864    126720     29568      3360       144         1
 17    131072    524288    860160    745472    366080    101376     14784       960        18
 18    262144   1114112   1966080   1863680   1025024    329472     59136      5280       180         1
 19    524288   2359296   4456448   4587520   2795520   1025024    219648     25344      1320        20
 20   1048576   4980736  10027008  11141120   7454720   3075072    768768    109824      7920       220         1
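
The tabulated coefficients can be regenerated from the closed forms of (3.26), which also provides a check on any transcription. The sketch below is one possible arrangement; the function names are illustrative, and the alternating signs in `eval_c` follow the usual expansion of $2\cos n\theta$ in powers of $2\cos\theta$.

```python
from math import acos, comb, cos, factorial


def c_coeff(n, p):
    """c_{n,2p} = n (n-p-1)! / (p! (n-2p)!), with c_{0,0} = 2."""
    if 2 * p > n:
        return 0
    if n == 0:
        return 2
    return n * factorial(n - p - 1) // (factorial(p) * factorial(n - 2 * p))


def s_coeff(n, p):
    """s_{n,2p} = (n-p)! / (p! (n-2p)!), a binomial coefficient."""
    return comb(n - p, p) if 2 * p <= n else 0


def t_coeff(n, p):
    """t_{n,2p} = 2^(n-2p-1) c_{n,2p}; the factor is 1/2 when n = 2p."""
    c = c_coeff(n, p)
    return c * 2 ** (n - 2 * p - 1) if n > 2 * p else c // 2


def u_coeff(n, p):
    """u_{n,2p} = 2^(n-2p) s_{n,2p}."""
    return s_coeff(n, p) * 2 ** (n - 2 * p) if 2 * p <= n else 0


def eval_c(n, x):
    """C_n(x) summed from its coefficients with alternating signs."""
    return sum((-1) ** p * c_coeff(n, p) * x ** (n - 2 * p) for p in range(n // 2 + 1))


if __name__ == "__main__":
    print([c_coeff(12, p) for p in range(7)])
    print([u_coeff(20, p) for p in range(11)])
    print(eval_c(5, 0.6), 2 * cos(5 * acos(0.3)))
```

Printing a full row for each $n$ from 0 to 20 reproduces the four tables above.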


3.3. Inverse Relations.—It is sometimes useful to express a given power series as the
sum of a series of Chebyshev polynomials; the following inverse relations may then be used
(cf. Lanczos, 1938, p. 147).

$$x^{2m} = C_{2m}(x) + 2m\,C_{2m-2}(x) + \binom{2m}{2}C_{2m-4}(x) + \ldots + \binom{2m}{p}C_{2m-2p}(x) + \ldots + \tfrac{1}{2}\binom{2m}{m}C_0$$

$$x^{2m+1} = C_{2m+1}(x) + (2m + 1)\,C_{2m-1}(x) + \ldots + \binom{2m+1}{p}C_{2m-2p+1}(x) + \ldots + \binom{2m+1}{m}C_1(x) \tag{3.31}$$

and

$$2^{2m-1}\mu^{2m} = T_{2m}(\mu) + 2m\,T_{2m-2}(\mu) + \ldots + \binom{2m}{p}T_{2m-2p}(\mu) + \ldots + \tfrac{1}{2}\binom{2m}{m}T_0$$

$$2^{2m}\mu^{2m+1} = T_{2m+1}(\mu) + (2m + 1)\,T_{2m-1}(\mu) + \ldots + \binom{2m+1}{p}T_{2m-2p+1}(\mu) + \ldots + \binom{2m+1}{m}T_1(\mu). \tag{3.32}$$

Likewise, dropping the argument $(2X - 1)$ in $T_r(2X - 1)$, we have

$$2^{2m-1}X^m = T_m + 2m\,T_{m-1} + \binom{2m}{2}T_{m-2} + \ldots + \binom{2m}{p}T_{m-p} + \ldots + \tfrac{1}{2}\binom{2m}{m}T_0. \tag{3.33}$$

The coefficients of the successive terms of the expansions of equations (3.31) are tabulated
below.


         C0    C1    C2    C3    C4    C5    C6    C7    C8    C9   C10   C11   C12
1         ½
x               1
x^2       1           1
x^3             3           1
x^4       3           4           1
x^5            10           5           1
x^6      10          15           6           1
x^7            35          21           7           1
x^8      35          56          28           8           1
x^9           126          84          36           9           1
x^10    126         210         120          45          10           1
x^11          462         330         165          55          11           1
x^12    462         792         495         220          66          12           1


This table may be easily extended by noting that each entry may be obtained from the adjacent 
items in the preceding line. 
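
The inverse relation (3.31) is easily verified numerically. The sketch below, in which the function names and the test argument are illustrative assumptions, rebuilds $x^n$ from the polynomials $C_k(x)$ using binomial weights, the weight of $C_0$ being halved.

```python
from math import comb


def chebyshev_c(n, x):
    """C_n(x) from the recurrence C_{k+1} = x C_k - C_{k-1}, with C_0 = 2, C_1 = x."""
    previous, current = 2.0, x
    if n == 0:
        return previous
    for _ in range(n - 1):
        previous, current = current, x * current - previous
    return current


def power_from_c(n, x):
    """x**n reassembled as a weighted sum of C_n, C_{n-2}, C_{n-4}, ..."""
    total = 0.0
    for p in range(n // 2 + 1):
        k = n - 2 * p
        weight = comb(n, p)
        if k == 0:
            weight = weight / 2      # the C_0 term carries half the binomial weight
        total += weight * chebyshev_c(k, x)
    return total


if __name__ == "__main__":
    x = 1.3                          # illustrative argument
    for n in (0, 1, 4, 9, 12):
        print(n, power_from_c(n, x), x ** n)
```

The weights `comb(n, p)` generated in this way are the entries of the table above, read along the row for $x^n$.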


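3.4. Differential Equations.

$$(4 - x^2)\frac{d^2 y}{dx^2} - x\frac{dy}{dx} + n^2 y = 0 \qquad \text{Solution: } y = a\,C_n(x) + b\,\sigma_n(x) \tag{3.41}$$

$$(1 - \mu^2)\frac{d^2 y}{d\mu^2} - \mu\frac{dy}{d\mu} + n^2 y = 0 \qquad \text{Solution: } y = a\,T_n(\mu) + b\,\upsilon_n(\mu) \tag{3.42}$$

$$(4 - x^2)\frac{d^2 y}{dx^2} - 3x\frac{dy}{dx} + (n^2 - 1)\,y = 0 \qquad \text{Solution: } y = a'S_{n-1}(x) + b'C_n(x)/\sigma_1(x) \tag{3.43}$$

$$(1 - \mu^2)\frac{d^2 y}{d\mu^2} - 3\mu\frac{dy}{d\mu} + (n^2 - 1)\,y = 0 \qquad \text{Solution: } y = a'U_{n-1}(\mu) + b'T_n(\mu)/\upsilon_1(\mu) \tag{3.44}$$

3.5. Orthogonality.

$$\int_{-2}^{2} C_m(x)C_n(x)\frac{dx}{\sqrt{4 - x^2}} = 0 \qquad \int_{-2}^{2} S_m(x)S_n(x)\sqrt{4 - x^2}\,dx = 0 \qquad m \ne n \tag{3.51}$$

$$\int_{-2}^{2} C_n^2(x)\frac{dx}{\sqrt{4 - x^2}} = 2\pi \qquad \int_{-2}^{2} S_n^2(x)\sqrt{4 - x^2}\,dx = 2\pi \qquad n \ge 1 \tag{3.52}$$

$$\int_{-1}^{1} T_m(\mu)T_n(\mu)\frac{d\mu}{\sqrt{1 - \mu^2}} = 0 \qquad \int_{-1}^{1} U_m(\mu)U_n(\mu)\sqrt{1 - \mu^2}\,d\mu = 0 \qquad m \ne n \tag{3.53}$$

$$\int_{-1}^{1} T_n^2(\mu)\frac{d\mu}{\sqrt{1 - \mu^2}} = \tfrac{1}{2}\pi \qquad \int_{-1}^{1} U_n^2(\mu)\sqrt{1 - \mu^2}\,d\mu = \tfrac{1}{2}\pi \qquad n \ge 1 \tag{3.54}$$

$$\int_{0}^{1} T_m T_n\frac{dX}{\sqrt{X - X^2}} = 0 \qquad \int_{0}^{1} U_m U_n\sqrt{X - X^2}\,dX = 0 \qquad m \ne n \tag{3.55}$$

$$\int_{0}^{1} T_n^2\frac{dX}{\sqrt{X - X^2}} = \tfrac{1}{2}\pi \qquad \int_{0}^{1} U_n^2\sqrt{X - X^2}\,dX = \tfrac{1}{8}\pi \qquad n \ge 1 \tag{3.56}$$

$$\int_{a}^{b} T_m T_n\frac{dz}{\sqrt{(z - a)(b - z)}} = 0 \qquad \int_{a}^{b} U_m U_n\sqrt{(z - a)(b - z)}\,dz = 0 \qquad m \ne n \tag{3.57}$$

$$\int_{a}^{b} T_n^2\frac{dz}{\sqrt{(z - a)(b - z)}} = \tfrac{1}{2}\pi \qquad \int_{a}^{b} U_n^2\sqrt{(z - a)(b - z)}\,dz = \tfrac{1}{8}\pi(b - a)^2 \qquad n \ge 1 \tag{3.58}$$

For brevity, the appropriate argument $\mu = 2X - 1 = (2z - b - a)/(b - a)$ has been omitted from
$T_n$, $U_n$ in (3.55)-(3.58).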

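3.6. Generating Functions.

$$\sum_{n=0}^{\infty} C_n(x)\,t^n = \frac{2 - xt}{1 - xt + t^2} \qquad \sum_{n=0}^{\infty} T_n(\mu)\,t^n = \frac{1 - \mu t}{1 - 2\mu t + t^2} \tag{3.61}$$

$$\sum_{n=0}^{\infty} S_n(x)\,t^n = \frac{1}{1 - xt + t^2} \qquad \sum_{n=0}^{\infty} U_n(\mu)\,t^n = \frac{1}{1 - 2\mu t + t^2} \tag{3.62}$$

$$\sum_{n=0}^{\infty} C_n(x)\frac{t^n}{n!} = 2e^{\frac{1}{2}xt}\cos(\tfrac{1}{2}t\sqrt{4 - x^2}) \qquad \sum_{n=0}^{\infty} T_n(\mu)\frac{t^n}{n!} = e^{\mu t}\cos(t\sqrt{1 - \mu^2}) \tag{3.63}$$

$$\sum_{n=0}^{\infty} \sqrt{4 - x^2}\,S_{n-1}(x)\frac{t^n}{n!} = 2e^{\frac{1}{2}xt}\sin(\tfrac{1}{2}t\sqrt{4 - x^2}) \qquad \sum_{n=0}^{\infty} \sqrt{1 - \mu^2}\,U_{n-1}(\mu)\frac{t^n}{n!} = e^{\mu t}\sin(t\sqrt{1 - \mu^2}) \tag{3.64}$$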


The connection with Jacobi polynomials (Szego, 1939, pp. 59, 68) gives a third set of generating 
functions: 


“^2.4.6... 

A i- 3 - 5 --- 
“J2.4.6... 


2 — /as4 -2a/ x -/x + / 2 l ^ 

I-/tf+/ 2 J 

,( 273 -fl) I f — 2 tX + 2 V^I — 

.(2^ + 2) %Xt ~~tV4-X 2 \ 1-tX+t* 

= {(l -to +* 2 )(2 — tX 4 * 2 A /1 ~£c+/ 2 )}~* 




.(272 - i) 


.272 


C n (x)t n 


4 


(3-65) 


with similar expressions involving T n} XJ n and /x. 
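
The ordinary and exponential generating functions for $C_n(x)$ can be spot-checked numerically. The sketch below is illustrative only: the function name `generating_function_gaps`, the test point $(x, t) = (0.7, 0.4)$ and the truncation at 200 terms are assumptions of this example. It sums the two series with $C_n$ produced by the three-term recurrence and subtracts the closed forms $(2 - xt)/(1 - xt + t^2)$ and $2e^{xt/2}\cos(\tfrac12 t\sqrt{4-x^2})$.

```python
import math

def generating_function_gaps(x, t, terms=200):
    """Truncated series minus closed form, for the ordinary and exponential cases."""
    c_now, c_next = 2.0, x                 # C_0(x), C_1(x)
    t_pow, t_pow_over_fact = 1.0, 1.0      # t**n and t**n / n!
    ordinary = exponential = 0.0
    for n in range(terms):
        ordinary += c_now * t_pow
        exponential += c_now * t_pow_over_fact
        c_now, c_next = c_next, x * c_next - c_now
        t_pow *= t
        t_pow_over_fact *= t / (n + 1)
    root = math.sqrt(4.0 - x * x)
    closed_ordinary = (2.0 - x * t) / (1.0 - x * t + t * t)
    closed_exponential = 2.0 * math.exp(0.5 * x * t) * math.cos(0.5 * t * root)
    return ordinary - closed_ordinary, exponential - closed_exponential

print(generating_function_gaps(0.7, 0.4))
```

For $|x| \le 2$ the values $C_n(x)$ stay within $\pm 2$, so the ordinary series converges for $|t| < 1$ and both differences should be at rounding level.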

3.7. Further Relationships. Since the tables give only $C_n(x)$, the following relations
may be found useful. We have (omitting the argument $(x)$)

$$C_n = S_n - S_{n-2}, \qquad dC_n/dx = nS_{n-1},$$

whence

$$S_{2n-1} = C_{2n-1} + C_{2n-3} + \dots + C_1 \qquad (3.71)$$

and

$$S_{2n} = C_{2n} + C_{2n-2} + \dots + C_2 + 1 \qquad (3.72)$$

and $dC_n/dx$ follows.

Also

$$\frac{dS_{n-1}}{dx} = \frac{1}{n}\frac{d^2C_n}{dx^2} = \frac{xS_{n-1} - nC_n}{4 - x^2}. \qquad (3.73)$$
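
A short Python sketch shows how $S_n$ and $dC_n/dx$ might be obtained from a row of $C$ values in this way. The helper names `C_values`, `S_from_C` and `dC_dx` are illustrative assumptions, and the row of $C$ values is generated here by the recurrence $C_{k+1} = xC_k - C_{k-1}$ rather than read from the tables.

```python
from fractions import Fraction

def C_values(n_max, x):
    """C_0 .. C_{n_max} at x, from C_{k+1} = x*C_k - C_{k-1} with C_0 = 2, C_1 = x."""
    vals = [Fraction(2), Fraction(x)]
    for _ in range(2, n_max + 1):
        vals.append(x * vals[-1] - vals[-2])
    return vals[: n_max + 1]

def S_from_C(n, c):
    """S_n = C_n + C_{n-2} + ..., ending in C_1 (n odd) or in C_2 + 1 (n even)."""
    total = sum(c[k] for k in range(n, 0, -2))
    return total + 1 if n % 2 == 0 else total

def dC_dx(n, c):
    """dC_n/dx = n * S_{n-1}, for n >= 1."""
    return n * S_from_C(n - 1, c)

x = Fraction(3, 5)
c = C_values(6, x)
print(S_from_C(2, c), dC_dx(3, c))
```

Since $C_3 = x^3 - 3x$, the second printed value can be compared directly with $3x^2 - 3$.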


4. Calculation of the Tables. 


4.1. The functions $C_n(x)$ were computed by C. W. J. and J. C. P. M. to 14 decimals,
using (3.11), for each particular value of $x$, and checked by (3.12) with $m = n$, i.e. $C_{2n} = C_n^2 - 2$.
Typed tables, suitable for printer's copy, were then prepared giving $C_n(x)$ for $n = 1(1)12$,
$x = 0(0.02)2$; exact values are given for $n \le 7$ and 10-decimal values for $n \ge 8$.




All typed values, together with the extra MS. digits, have been checked by formation of
selected differences. For $n \le 7$ every value was used in the formation of at least one value
of $\delta^{10}$, which was required to be exactly zero; every tenth such difference was evaluated.
For $n \ge 8$ (where exact values were not calculated by C. W. J. and J. C. P. M.), selected
values of $\delta^{10}$ were similarly evaluated, and the results recorded; further, the four final digits
(11th to 14th decimals) were used to give other values of $\delta^{10}$, one half-way between each pair
of values previously obtained. For $n \le 9$, these differences should all be close to zero, and
for $n \ge 10$ they should be close to values calculated in advance; agreement was satisfactory
in all cases.
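
The difference check can be illustrated in Python with exact rational arithmetic. This is a sketch of the idea only, not a reconstruction of the desk-machine procedure: the names `C_exact` and `nth_differences` are assumptions, and every tenth difference is formed rather than a selection. For a polynomial of degree below 10 the tenth differences vanish, and for the monic polynomial $C_{10}$ they equal the constant $10!\,h^{10}$ with $h = 0.02$.

```python
from fractions import Fraction
from math import factorial

def C_exact(n, x):
    """Exact C_n(x) for rational x, by C_{k+1} = x*C_k - C_{k-1}, C_0 = 2, C_1 = x."""
    a, b = Fraction(2), x
    for _ in range(n):
        a, b = b, x * b - a
    return a

def nth_differences(values, order):
    """Repeated forward differencing of a list of tabular values."""
    for _ in range(order):
        values = [hi - lo for lo, hi in zip(values, values[1:])]
    return values

h = Fraction(1, 50)                        # tabular interval 0.02
xs = [k * h for k in range(101)]           # x = 0(0.02)2

d9 = nth_differences([C_exact(9, x) for x in xs], 10)
d10 = nth_differences([C_exact(10, x) for x in xs], 10)
print(set(d9), set(d10) == {factorial(10) * h ** 10})
```

With rounded 10-decimal values the same differences would only be close to these targets, which is the sense of "close to zero" in the text.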

4.2. Values of the polynomials were obtained independently by J. F. C. C. and R. C. P.,
all being exact up to and including $C_{10}$. For each of the lower degree polynomials a complete
difference table was built up, starting from constant $\delta^n$; for the higher degree polynomials
to $n = 10$, (3.12) was used, with various values of $m$, and the results checked by formation of
selected values of $\delta^n$, as explained above. In each case $T_n$ was found and the results doubled
to give $C_n$. Finally all entries for $x = 0(0.1)2$ were checked by direct evaluation of $C_n$. For
$n = 11$ and 12, values were found to 12 decimals only, again by using (3.12) to give $T_{11}$ and
$T_{12}$, and then doubling the results.

4.3. The two independent sets of values were compared with each other as an additional
check (although the only check on the 12-decimal values of $C_{11}$ and $C_{12}$); no discrepancies
were found.

Typed copies of the tables, giving exact values of $C_n(x)$ for $n \le 10$, have been presented
to the Royal Society of Edinburgh and to the British Association Committee on the
Calculation of Mathematical Tables.

5. Values for Non-Tabular Arguments. No provision has been made for interpolation.
This is because it seems likely that tabular arguments will usually suffice. Again, if this is
not the case, it seems almost certain that values of $C_n(x)$, when required, will be needed for
several different values of $n$ for each individual value of $x$ used, i.e. that interpolation at the
same argument in several columns would be necessary. In these circumstances, it is better
to use (3.11), starting with $C_0 = 2$, $C_1 = x$, than to interpolate, checking finally by means of
$C_{2n} = C_n^2 - 2$. An alternative check is to obtain the final value by interpolation from the
present tables; formulae of the Lagrange type seem most appropriate for such infrequent
calculations. Yet another possibility is to evaluate $\theta = \cos^{-1}\tfrac12 x = \tan^{-1}\left(\sqrt{4-x^2}/x\right)$ and from
it to obtain $2\cos n\theta$ for the required value of $n$.
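
The procedure for a non-tabular argument might look as follows in Python. It is a sketch under stated assumptions: the argument $x = 0.3137$ is arbitrary, the names `C_by_recurrence` and `C_by_angle` are illustrative, and ordinary floating-point arithmetic stands in for the 10 or more decimals of the tables.

```python
import math

def C_by_recurrence(n_max, x):
    """C_0 .. C_{n_max} at x, starting from C_0 = 2, C_1 = x."""
    vals = [2.0, x]
    for _ in range(2, n_max + 1):
        vals.append(x * vals[-1] - vals[-2])
    return vals[: n_max + 1]

def C_by_angle(n, x):
    """C_n(x) = 2 cos(n*theta) with theta = arccos(x/2), valid for |x| <= 2."""
    theta = math.acos(0.5 * x)
    return 2.0 * math.cos(n * theta)

x = 0.3137                                 # an arbitrary non-tabular argument
c = C_by_recurrence(12, x)
doubling_gap = max(abs(c[2 * n] - (c[n] ** 2 - 2.0)) for n in range(1, 7))
angle_gap = max(abs(c[n] - C_by_angle(n, x)) for n in range(13))
print(doubling_gap, angle_gap)
```

One pass of the recurrence yields every column at once, and the doubling identity and the trigonometric form then serve as two independent checks on the result.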

Isolated values may also be found in the same way, if needed, but it seems useful to indicate
how the recurrence formulae may be combined to cut out intermediate steps. Thus

$$C_2 = x^2 - 2, \qquad C_4 = C_2^2 - 2, \qquad C_8 = C_4^2 - 2$$

gives $C_8$ in three stages;

$$C_2 = x^2 - 2, \qquad C_3 = xC_2 - C_1, \qquad C_6 = C_3^2 - 2, \qquad C_9 = C_3(C_6 - 1)$$

gives $C_9$ in 4 stages. The processes are not ideal, but may often be less troublesome than
interpolation.
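
The two shortened schemes can be written out and compared with the step-by-step recurrence. The Python below is an illustrative sketch; the function names and the sample argument $x = 0.37$ are assumptions of this example, and exact fractions are used so that the comparison is free of rounding.

```python
from fractions import Fraction

def C8_in_three_stages(x):
    c2 = x * x - 2
    c4 = c2 * c2 - 2
    return c4 * c4 - 2

def C9_in_four_stages(x):
    c2 = x * x - 2
    c3 = x * c2 - x                  # C_3 = x*C_2 - C_1, with C_1 = x
    c6 = c3 * c3 - 2
    return c3 * (c6 - 1)

def C_stepwise(n, x):
    """C_n(x) by the full recurrence C_{k+1} = x*C_k - C_{k-1}, C_0 = 2, C_1 = x."""
    a, b = 2, x
    for _ in range(n):
        a, b = b, x * b - a
    return a

x = Fraction(37, 100)
print(C8_in_three_stages(x) == C_stepwise(8, x),
      C9_in_four_stages(x) == C_stepwise(9, x))
```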

In all calculations of this type, mental interpolation in the tables gives a useful check
against gross errors, such as errors of sign.

6. Acknowledgment. —The writers wish to express their thanks to Dr L. J. Comrie for 
help in the design and preparation of the diagrams and for verifying the last n digits of 
the exact values of C 7 to C 10 by differencing them on a National accounting machine. 


REFERENCES TO LITERATURE 

Lanczos, C., 1938. “Trigonometric Interpolation of Empirical and Analytical Functions”, Journ.
of Math. and Phys., XVII, 123-199.

Szegö, G., 1939. Orthogonal Polynomials, Amer. Math. Soc., New York.

Van der Pol, B., and Weijers, T. J., 1933. “Tchebycheff Polynomials and their Relation to
Circular Functions, Besselfunctions and Lissajous-Figures”, Physica, 1, No. 1, 78-96.




























Chebyshev Polynomials


x     μ     X      C_1   C_2      C_3        C_4          C_5            1-X

0.00  0.00  0.500  0.00  -2.0000  -0.000000  +2.00000000  +0.0000000000  0.500
0.02  0.01  0.505  0.02  -1.9996  -0.059992  +1.99840016  +0.0999600032  0.495
0.04  0.02  0.510  0.04  -1.9984  -0.119936  +1.99360256  +0.1996801024  0.490
0.06  0.03  0.515  0.06  -1.9964  -0.179784  +1.98561296  +0.2989207776  0.485
0.08  0.04  0.520  0.08  -1.9936  -0.239488  +1.97444096  +0.3974432768  0.480

0.10  0.05  0.525  0.10  -1.9900  -0.299000  +1.96010000  +0.4950100000  0.475
0.12  0.06  0.530  0.12  -1.9856  -0.358272  +1.94260736  +0.5913848832  0.470
0.14  0.07  0.535  0.14  -1.9804  -0.417256  +1.92198416  +0.6863337824  0.465
0.16  0.08  0.540  0.16  -1.9744  -0.475904  +1.89825536  +0.7796248576  0.460
0.18  0.09  0.545  0.18  -1.9676  -0.534168  +1.87144976  +0.8710289568  0.455

0.20  0.10  0.550  0.20  -1.9600  -0.592000  +1.84160000  +0.9603200000  0.450
0.22  0.11  0.555  0.22  -1.9516  -0.649352  +1.80874256  +1.0472753632  0.445
0.24  0.12  0.560  0.24  -1.9424  -0.706176  +1.77291776  +1.1316762624  0.440
0.26  0.13  0.565  0.26  -1.9324  -0.762424  +1.73416976  +1.2133081376  0.435
0.28  0.14  0.570  0.28  -1.9216  -0.818048  +1.69254656  +1.2919610368  0.430

0.30  0.15  0.575  0.30  -1.9100  -0.873000  +1.64810000  +1.3674300000  0.425
0.32  0.16  0.580  0.32  -1.8976  -0.927232  +1.60088576  +1.4395154432  0.420
0.34  0.17  0.585  0.34  -1.8844  -0.980696  +1.55096336  +1.5080235424  0.415
0.36  0.18  0.590  0.36  -1.8704  -1.033344  +1.49839616  +1.5727666176  0.410
0.38  0.19  0.595  0.38  -1.8556  -1.085128  +1.44325136  +1.6335635168  0.405

0.40  0.20  0.600  0.40  -1.8400  -1.136000  +1.38560000  +1.6902400000  0.400
0.42  0.21  0.605  0.42  -1.8236  -1.185912  +1.32551696  +1.7426291232  0.395
0.44  0.22  0.610  0.44  -1.8064  -1.234816  +1.26308096  +1.7905716224  0.390
0.46  0.23  0.615  0.46  -1.7884  -1.282664  +1.19837456  +1.8339162976  0.385
0.48  0.24  0.620  0.48  -1.7696  -1.329408  +1.13148416  +1.8725203968  0.380

0.50  0.25  0.625  0.50  -1.7500  -1.375000  +1.06250000  +1.9062500000  0.375
0.52  0.26  0.630  0.52  -1.7296  -1.419392  +0.99151616  +1.9349804032  0.370
0.54  0.27  0.635  0.54  -1.7084  -1.462536  +0.91863056  +1.9585965024  0.365
0.56  0.28  0.640  0.56  -1.6864  -1.504384  +0.84394496  +1.9769931776  0.360
0.58  0.29  0.645  0.58  -1.6636  -1.544888  +0.76756496  +1.9900756768  0.355

0.60  0.30  0.650  0.60  -1.6400  -1.584000  +0.68960000  +1.9977600000  0.350
0.62  0.31  0.655  0.62  -1.6156  -1.621672  +0.61016336  +1.9999732832  0.345
0.64  0.32  0.660  0.64  -1.5904  -1.657856  +0.52937216  +1.9966541824  0.340
0.66  0.33  0.665  0.66  -1.5644  -1.692504  +0.44734736  +1.9877532576  0.335
0.68  0.34  0.670  0.68  -1.5376  -1.725568  +0.36421376  +1.9732333568  0.330

0.70  0.35  0.675  0.70  -1.5100  -1.757000  +0.28010000  +1.9530700000  0.325
0.72  0.36  0.680  0.72  -1.4816  -1.786752  +0.19513856  +1.9272517632  0.320
0.74  0.37  0.685  0.74  -1.4524  -1.814776  +0.10946576  +1.8957806624  0.315
0.76  0.38  0.690  0.76  -1.4224  -1.841024  +0.02322176  +1.8586725376  0.310
0.78  0.39  0.695  0.78  -1.3916  -1.865448  -0.06344944  +1.8159574368  0.305

0.80  0.40  0.700  0.80  -1.3600  -1.888000  -0.15040000  +1.7676800000  0.300
0.82  0.41  0.705  0.82  -1.3276  -1.908632  -0.23747824  +1.7138998432  0.295
0.84  0.42  0.710  0.84  -1.2944  -1.927296  -0.32452864  +1.6546919424  0.290
0.86  0.43  0.715  0.86  -1.2604  -1.943944  -0.41139184  +1.5901470176  0.285
0.88  0.44  0.720  0.88  -1.2256  -1.958528  -0.49790464  +1.5203719168  0.280

0.90  0.45  0.725  0.90  -1.1900  -1.971000  -0.58390000  +1.4454900000  0.275
0.92  0.46  0.730  0.92  -1.1536  -1.981312  -0.66920704  +1.3656415232  0.270
0.94  0.47  0.735  0.94  -1.1164  -1.989416  -0.75365104  +1.2809840224  0.265
0.96  0.48  0.740  0.96  -1.0784  -1.995264  -0.83705344  +1.1916926976  0.260
0.98  0.49  0.745  0.98  -1.0396  -1.998808  -0.91923184  +1.0979607968  0.255

1.00  0.50  0.750  1.00  -1.0000  -2.000000  -1.00000000  +1.0000000000  0.250

-x    -μ    1-X    -C_1  +C_2     -C_3       +C_4         -C_5           X

$\tfrac12 x = \mu = 2X - 1 = \cos\theta$    $C_0 = 2 = 2T_0$    $C_n = 2\cos n\theta = 2T_n$    $X = \cos^2\tfrac12\theta$    $1 - X = \sin^2\tfrac12\theta$





Chebyshev Polynomials


x   μ   X   C_1   C_2   C_3   C_4   C_5   1-X

1-00 

0-50 

o*75° 

1-00 

- I-oooo 

- 2-00000 0 

- I-ooooo 000 

+ 1*00000 OOOOO 

0*250 

•02 

•51 

*755 

•02 

0-9596 

1-998792 

•07916784 

0*89804 08032 

*245 

•04 

.52 

•760 

•04 

•9184 

•995136 

•15654144 

•79233 29024 

•240 

•06 

•53 

■765 

•06 

•8764 

•98898 4 

•23192 304 

•6831455776 

*235 

•08 

•54 

■770 

•08 

•8336 

•98028 8 

•30511104 

- *5707680768 

*230 

rio 

o*55 

0 -7’S 

1*10 

- 0-7900 

-1-969000 

-1-37590000 

+ 0*45551 ooooo 

0*225 

•12 

*56 

•7'SO 

•12 

■7456 

•95507 2 

•44408 064 

*3377016832 

•220 

•14 

’57 

•78s 

*14 

•7004 

•93845 6 

•50943984 

•2176945824 

-215 

•16 

•ss 

•790 

•16 

•6544 

•9I9I04 

•57176 064 

+ *0958616576 

•210 

•iS 

*59 

•795 

•iS 

•6076 

•89696 8 

•63082 224 

- *0274022432 

•205 

1*20 

o*6o 

o-8oo 

1*20 

-0*5600 

-1-87200 0 

-1-68640 000 

-0-1516800000 

0*200 

•22 

•61 

•805 

•22 

•5116 

•844152 

•73826 544 

•2765318368 

-195 

•24 

•62 

•810 

•24 

•4624 

•81337 6 

•78618624 

•4014949376 

-190 

•26 

’63 

•815 

•26 

•4124 

•77962 4 

•82992 624 

•52608 30624 

•185 

•28 

•64 

•820 

■28 

•3616 

•74284 8 

•86924544 

•64978 61632 

-l80 

1*30 

0-65 

0-825 

1-30 

-0-3100 

-1-70300 0 

-1-90390000 

-0-7720700000 

0-175 

•32 

•66 

•830 

*32 

•2576 

•66003 2 

•93364224 

0-89237 57568 

*170 

*34 

•67 

*835 

*34 

•2044 

•61389 6 

•95822 064 

i-oioii 96576 

■^5 

*3 6 

•68 

•840 

■36 

•1504 

•564544 

•97737 984 

•1246925824 

*l60 

•38 

•69 

•84s 

•38 

•0956 

•511928 

•99086 064 

•23545 96832 

* r 55 

1-40 

0*70 

0-850 

1*40 

- 0-0400 

-1-456000 

-1-99840000 

-1-3417600000 

0-150 

•42 

•71 

-855 

*42 

+ -0164 

•396712 

•99973104 

*44290 60768 

*145 

•44 

•72 

•860 

•44 

•0736 

•33401 6 

•99458 304 

•5381835776 

•140 

•46 

*73 

•865 

■46 

*1316 

■267864 

•98268 144 

•62685 09024 

*135 

•48 

*74 

•870 

•48 

•1904 

•19820 8 

•96374784 

•70813 88032 

•130 

1-50 

o*75 

0*875 

1*50 

+0-2500 

- I-I250O0 

-* 1 *93750 000 

-1-7812500000 

0-125 

'52 

•76 

•880 

* 5 2 

•3104 

1-04819 2 

*90365184 

•84535 87968 

■120 

’54 

*77 

•885 

*54 

•3716 

0-96773 6 

•86191 344 

•8996106976 

•115 

* 5 6 

•78 

•890 

*56 

•4336 

•883584 

•81199 104 

•94312 20224 

•no 

•58 

*79 

•895 

*58 

■4964 

•79568 8 

*75358 704 

*97497 9S23 2 

•105 

i-6o 

o-8o 

0-900 

i-6o 

+0*5600 

- 0-70400 0 

-1-68640 000 

-1*9942400000 

o-ioo 

•62 

•Si 

•905 

•62 

■6244 

•60847 2 

•61012464 

•99992 99168 

*095 

•64 

•82 

•910 

•64 

•6896 

•50905 6 

•52445 184 

-9910450176 

•090 

•66 

•83 

■915 

•66 

•7556 

•405704 

•42906 864 

*96654 99424 

•085 

•68 

•84 

•920 

•68 

•8224 

•298368 

•32365 824 

*92537 78432 

•080 

170 

0*85 

0-925 

1-70 

+ 0-8900 

-0-187000 

-1-20790000 

-1*86643 ooooo 

0-075 

•72 

•86 

•930 

•72 

0-9584 

- -071552 

1-08146944 

*78357 54368 

•070 

‘74 

•87 

*935 

■74 

1-0276 

+ *048024 

0-94403 824 

•69065 05376 

•065 

•76 

•88 

•940 

•76 

•0976 

•171776 

•79527 424 

•57145 86624 

•060 

•78 

•89 

*945 

•78 

•1684 

•29975 2 

•63484 144 

•42976 97632 

*055 

i*8o 

0*90 

0-950 

i-8o 

+ 1*2400 

+ 0-43200 0 

- 0-46240 000 

-1-26432 ooooo 

0-050 

•82 

•91 

*955 

•82 

•3124 

•568568 

•27760 624 

1-0738113568 

*045 

•84 

•92 

•960 

*84 

•3856 

•70950 4 

- -08011264 

0*8569112576 

•040 

•86 

*93 

•965 

•86 

•4596 

0-85485 6 

+ -13043216 

•61225 21824 

*035 

•SS 

’94 

•970 

•88 

•5344 

1-00467 2 

*35438 336 

•3384312832 

•030 

1*90 

o*05 

0*975 

1*90 

+1-6100 

+1-159000 

+ 0-59210000 

-0*03401 ooooo 

0*025 

*02 

•06 

•980 

* -92 

•6864 

•317888 

0-84394496 

+0-3024863232 

•020 

77"* 

■94 

*97 

•985 

*94 

•7636 

•481384 

1-11028 496 

0*67256 88224 

*015 

•96 

■98 

•990 

•96 

•8416 

•64953 6 

1*39149056 

1*07778 54976 

•010 

1-98 

o*99 

o*995 

1-98 

1-9204 

1-822392 

1-68793 616 

i*5 x 972 15968 

•005 

2*00 

I *00 

I *000 

2*00 

+ 2-0000 

+ 2*00000 O 

+ 2-00000 OOOC 

) + 2-00000 ooooo 

0*000 



-x   -μ   1-X   -C_1   +C_2   -C_3   +C_4   -C_5   X

$\tfrac12 x = \mu = 2X - 1 = \cos\theta$    $C_0 = 2 = 2T_0$    $C_n = 2\cos n\theta = 2T_n$    $X = \cos^2\tfrac12\theta$    $1 - X = \sin^2\tfrac12\theta$


14 


P.R.S.E.—VOL. LXII, A, 1945-46, PART II 





Chebyshev Polynomials 


x   μ   X   C_6   C_7   C_8   1-X

0*00 

0*00 

0*500 

- 2*00000 00000 00 

- 0*00000 00000 0000 

+ 2*00000 00000 

0*500 

*02 

•01 

* 5°5 

1*99640 O9599 36 

•13988 80223 9872 

1-9936031995 

*495 

*04 

*02 

.510 

•985615355904 

•2791047166 3616 

•9744511672 

•490 

*06 

'03 

• 5 i 5 

•96767 7713344 

*41698 14404 0064 

•94265 88269 

•485 

*08 

•04 

•520 

•9426454978 s 6 

•55285491662848 

•8984171045 

•480 

o-io 

0*05 

0*525 

-1*91059 90000 00 

- o* 68606 99000 0000 

+ 1*84199 20100 

0-475 

*12 

•06 

•530 

•87164 x 1740 16 

0*81598 182408192 

•77372 33551 

• 47 ° 

*14 

*07 

•535 

■82589 74304 64 

0*94195 94226 6496 

•694O23III3 

•465 

*l6 

*08 

•540 

• 7735 1 53827 84 

1-0633873188 4544 

•6033734118 

•46O 

*l8 

*09 

•545 

■7146645477 76 

1-1796685753 9968 

•50232 42042 

*455 

0*20 

o-io 

0-550 

-1*64953 60000 00 

-1*29022 720000000 

-l“I* 39 I 49 05600 

0*450 

*22 

•11 

*555 

•5783419800 96 

•39451059882112 

•2715496484 

*445 

*24 

*12 

•560 

•501315457024 

•4919919720 8576 

■14323 73S37 

•440 

•26 

•13 

*565 

•41870 96442 24 

■58217264509824 

1-0073447565 

*435 

*28 

*14 

•570 

•33079 74696 96 

■66458 432831488 

0-8647138578 

*430 

0*30 

0*15 

Q -575 

-1*23787 10000 00 

-1*73879 130000000 

+ 0*71623 36XOO 

0-425 

*32 ] 

•16 

•580 

1*1402408181 76 

•80439 250501632 

*56283 52166 

*420 

*34 | 

*17 

•585 

1-03823 S3S5S 84 

•86102 35632 9856 

•40548 73441 

•415 

*36 

•18 

•590 

0*9322001776 64 

•90835 86815 5904 

•24519 IO523 

•410 

•38 

*19 

*595 

0*82249 72236 l6 

•9461 x 24617 7408 

+ *0829744881 

•405 

0*40 

0*20 

o*6oo 

-0*70950 40000 00 

-1*97404 16000 0000 

- 0 * 080 II 264OO 

0*400 

•42 

*21 

•605 

■59361 27282 56 

•9919464690 6752 

■243OO 47888 

■395 

*44 

*22 

•610 

•47S22 94461 44 

■99967 25787 0336 

*40462 64885 

| *390 

*46 


■6 IS 

•35477 30631 04 

•9971119066 2784 

•5638984139 

■385 

*48 

•24 

•620 

•23267 43695 36 

•98420 40941 7728 

• 7*974 35957 

*380 

0*50 

0*25 

0*625 

-0*10937 5000000 

-1*96093 75000 0000 

- 0 - 87 I 09 37500 

o *375 

•52 

*26 

‘63O 

+ *01467 36496 64 

•9273501053 7472 

r*0X689 57045 

•370 

•54 

•27 

•635 

*13901 1551296 

•88353 02647 0016 

‘X 56 XI 78942 

•365 

*56 

*28 

•640 

•26317 12194 56 

•82961 72947 0464 

*28775 69045 

*360 

*58 

•29 

■645 

*38667 89325 44 

•7658018959 2448 

•4IO84 40322 

*355 

0*60 

0*30 

0*650 

+ 0*50905 60000 00 

-1*69232 64000 0000 

- x *52445 18400 

0*350 

*62 

•31 

•655 

•62982 00755 84 

•60948 48363 3792 

*62770 06741 

*345 

*64 

.32 

*660 

•74848 65167 36 

■51762 281168896 

•7x97651162 

*340 

•66 

*33 

■665 

*8645697900 16 

*41713719618944 

•79988 03395 

*335 

■68 

*34 

*670 

0*9775849226 24 

•30847 560941568 

*8673483370 

*330 

0*70 

o* 3 S 

0*675 

+ 1*087049000000 

-1*192x3570000000 

-1*9215439900 

0*325 

*72 

‘36 

‘68O 

*19248 27095 04 

x *06866 42123 5712 

•9619209424 

•320 

74 

*37 

*685 

*2934119301 76 

0*93865 583406976 

•98801 72474 

*315 

*76 

*38 

*690 

*389369368576 

•80275 181748224 

•9994607499 

•310 

*78 

*39 

•695 

•47989 62407 04 

*66163 83690 5088 

•9959741686 

*305 

o-8o 

0-40 

0*700 

+ 1*5645440000 00 

-0*51604 48000 0000 

-1*9773798400 

0*300 

*82 

• 4 i 

' 70 S 

•642876111424 

•36674143183232 

*9436040855 

•295 

•84 

•42 

*710 

•71446 98716 16 

•2145372502 4256 

*8946811618 

•290 

•86 

•43 

*715 

•778918275136 

- *06027730098304 

•83075 67540 

•285 

*88 

•44 

•720 

*8358319267 84 

+ *095160x7876992 

*7 520909695 

•280 

0*90 

o *45 

0*725 

+1*884841000000 

+0*25086 69000 0000 

-1*65906 07900 

0*275 

•92 

*46 

*730 

*925597241344 

•4059079388 3648 

•5521619376 

, *270 

*94 

*47 

*735 

•95777 60210 56 

*55932 54373 9264 

•4320101099 

•265 

•96 

*48 

•740 

•98107 84296 96 

•7101425949 08x6 

•2993415386 

•260 

0*98 

*49 

*745 

I ’ 99 S 2 3 34208 64 

0*85736795564672 

•15501 28243 

*255 

i-oo 

0-50 

0*750 

+ 2*00000 00000 00 

+1 *00000 00000 0000 

- 1*00000 00000 

0*250 

-x   -μ   1-X   +C_6   -C_7   +C_8   X

$\tfrac12 x = \mu = 2X - 1 = \cos\theta$    $C_n = 2\cos n\theta = 2T_n$    $X = \cos^2\tfrac12\theta$    $1 - X = \sin^2\tfrac12\theta$


Chebyshev Polynomials


x   μ   X   C_6   C_7   C_8   1-X

1*00 

0*50 

0-75° 

+ 2*00000 00000 00 

+ 1*00000 00000 0000 

- 1*00000 00000 

0*250 

•02 

•5i 

*755 

1-995169459264 

•13703 204524928 

0-8353967731 

•245 

•04 

*52 

•760 

•9805676584 96 

•26745 746243584 

•66241 18976 

•240 

•06 

*53 

•765 

*95605 73522 56 

•39027521579136 

•48236 56235 

•235 

•08 

*54 

•770 

•921540562944 

■50449 573“ 7952 

•2966851733 

*230 

I'lO 

o*55 

0775 

+1*87696 10000 00 

+1*6091471000 0000 

-0-1068991900 

0*225 

•12 

•56 

•780 

•82230 65251 84 

•70328 162500608 

+ *0853688948 

•220 

•14 

*57 

■785 

•757*6116639 36 

•78598271448704 

•27840 86306 

*215 

•l6 

•58 

•790 

*68296 01628 16 

•85637213126656 

•470431509s 

•210 

•l8 

*59 

■795 

•59848 75930 24 

•91361 76029 6832 

•6595811785 

•205 

1-20 

o*6o 

o-8oo 

+1*50438 40000 00 

+ 1 *95^94 08000 0000 

+0*8439449600 

0*200 

•22 

•6i 

•805 

•4008965991 04 

•98562 56877 0688 

1*02156 67399 

•195 

•24 

•62 

•810 

•28833 25173 76 

•99902725914624 

*1904612840 

*190 

•26 

•63 

•815 

•16706 15813 76 

•99658065493376 

•34863 00438 

•185 

•28 

•64 

*820 

1-03751 9 I 5 II °4 

*97781 06766 1312 

•4940785150 

*180 

1*30 

0*65 

0-825 

+ 0*90020 90000 00 

+1-94234 17000 0000 

+ 1-62483 52100 

0*175 

.32 

*66 

•830 

•755706241024 

•88990 799495168 

■73897 2 3 I2 3 

*170 

*34 

•67 

•835 

•60466 02988 16 

*82036445801344 

•83462 80749 

*165 

•36 

*68 

•840 

■4477979279 36 

•73369 77643 9296 

•91003 10316 

*160 

•38 

*69 

•845 

•28592 6277184 

•63003 79457 1392 

•96352 60879 

*155 

1*40 

0*70 

0*850 

+ 0*119936000000 

+1-50967 04000 0000 

+ 1*9936025600 

0*150 

•42 

•71 

•855 

- -049195589056 

•37304834034048 

•99892 42323 

*145 

*44 

*72 

•860 

•220401311744 

•22080 56886 8864 

•9783615035 

•140 

*46 ! 

*73 

•865 

•39252 08775 04 

1-05377042124416 

•93102 56925 

•135 

•48 

*74 

•870 

•5642975887 36 

0-87297 837187072 

•8563055791 

•130 

1*50 

o*75 

0-875 

-0*73437 5°o°° 00 

+0-67968 7 5000 0000 

+ 1*7539062500 

0*125 

•52 

•76 

•880 

0*9012935311 36 

•47539 262947328 

•6238903279 

■120 

*54 

*77 

•885 

1-06348 70343 04 

•26184 066477184 

*4667216581 

*115 

•56 

•78 

■890 

•219279314944 

+ *04104629108736 

*2833115290 

•no 

.58 

*79 

•89s 

•36688 06066 56 

- -18469 18353 1648 

1*0750675069 

*105 

1*60 

o*8o 

0*900 

-1*5043840000 00 

- 0*41277 44000 0000 

+0*8439449600 

o-ioo 

•62 

•81 

•905 

•6297618252 16 

0-64028 42400 4992 

*5925013563 

*095 

•64 

*82 

*910 

•7408619888 64 

0-86396 86441 3696 

•32395 34x25 

*090 

*66 

*83 

*9i5 

•8354042643 84 

I-08022113647744 

+ *0422371778 

•085 

*68 

•84 

*920 

•91097 65365 76 

1-2850627382 4768 

- -2479288637 

•080 

170 

0*85 

o*925 

-1-96503 1000000 

-1-47412 27000 0000 

- 0*54097 75900 

0-075 

•72 

•86 

•930 

•9948803112 96 

•64261 86986 2912 

0-83042 38503 

•070 

*74 

•87 

*935 

•99769 36954 24 

•78533 649243 776 

1*1087918014 

*065 

•76 

•88 

*940 

•97049 30058 24 

•89660 90278 5024 

*36753 888 3 2 

*060 

•78 

*89 

*945 

•9101487384 96 

•97029 499132288 

•59697 63461 

•055 

r8o 

0-90 

0*950 

-1*81337 6000000 

-1*99975 68000 0000 

-1*78618 62400 

0-050 

•82 

•91 

*955 

•6767304293 76 

•97783 80246 6432 

•9229347755 

•045 

•84 

*92 

.. *960 

•49660 40739 8 4 

•89684 023853056 

•9935819649 

•040 

•86 

*93 

*965 

1*26922 12192 64 

•74849 928543104 

•98298 74516 

•035 

*88 

*94 

*970 

0-9906341724 16 

•52396 09609 4208 

*8744124342 

*030 

1*90 

0*95 

o*975 

-0*65671 9000000 

-1-21375 61000 0000 

-1*6494175900 

0*025 

•92 

•96 

•980 

-0*26317 12194 56 

0-80777 50645 555 2 

1*28775 69045 

•020 

*94 

*97 

*985 

+ 0-19449 8 55S4 56 

-0-2952416248 1536 

0*7672673076 

•015 

*96 

•98 

*99° 

0-72096 9015296 

+ 0-3353 1 37723 8016 

-0*0637540214 

•010 

1-98 

o*99 

0*995 

1*32111 2601664 

1*09608135449472 

+0*8491284802 

•005 

2-00 

1*00 

1*000 

+ 2*00000 00000 00 

+ 2-00000 00000 0000 

+ 2*0000000000 

o-ooo 

-x   -μ   1-X   +C_6   -C_7   +C_8   X

$\tfrac12 x = \mu = 2X - 1 = \cos\theta$    $C_n = 2\cos n\theta = 2T_n$    $X = \cos^2\tfrac12\theta$    $1 - X = \sin^2\tfrac12\theta$




Chebyshev Polynomials 


x   μ   X   C_9   C_10   C_11   C_12   1-X



o-noo 

+ 0-00000 00000 

- 2*00000 OOOOO 

- 0-00000 ooooo 

+ 2-00000 OOOOO 

0-500 

yOO 

•ox 

•505 

•1797600864 

I -99000 7997 S 

•2195602463 

1-98561 67928 

•495 

*02 

*02 

•510 

•35808 27633 

•96012 78567 

■43648 78776 

•94266 83416 

•490 

*04 

*03 

* 5 I 5 

•5335409700 

•9106463687 

■64817 97521 

•8717s 55836 

■485 

*0 6 
*o8 

*04- 

*520 

•70472 82850 

•84203 88417 

0-8520913923 

•7738715303 

*480 


0*05 

0*525 

+0*8702691010 

-175496 50999 

-1-04576 56110 

+ 1-65038 85388 

o *475 

o*i° 

*06 

•530 

1*02882 86267 

•65026 39199 

-22686 02971 

•5030406843 

■470 

•12 

*07 

*535 

•17912 265S2 

•52S94 59391 

•39317 50897 

•3339014266 

•465 

•14 

*08 

*540 

•31992 70647 

•3921850814 

•54267 66778 

M 4535 68130 

•460 

•16 

•iS 

•09 

•S 4 S 

•45008 69322 

■2413085564 

•67352 24723 

0-94007 45114 

*455 


0*10 

0*550 

+ 1-5685253120 

-1-0777854976 

-1-78408 24115 

+ 0-72096 90153 

0*450 

0'20 

•IX 

*555 

*67425 15215 

0-9032I 43136 

•87295 86705 

•4911634061 

*445 

*22 

*12 

•560 

•76636 89442 

*7X93088371 

•93900 30651 

•2539481015 

•440 

*24 

•13 

*565 

•84408 22818 

•527^833632 

•9813319562 

+ -0127370546 

*435 

*26 

*28 

•14 

■570 

•90670 42085 

•33083 66794 

•99933 84787 

- -2289780947 

•430 


0*15 

o *575 

+ 1-9536613830 

-0-1301351951 

-1-9927019415 

-0-46767 53874 

0-425 

o* 3 ° 

*16 

*580 

•9844997743 

+ -07220471X2 

•9613942667 

0-69985 08766 

•420 

* 3 2 

•17 

*585 

•99888 92603 

•27413 50044 

•90588 33588 

0-92206 73464 

* 4 X 5 

• 3 J 

•l8 

*590 

•99662 74604 

•4735948334 

•82613 33204 

1-13100 28288 

•410 

•38 

•19 

*595 

■97764 27673 

•6685297634 

•7236014572 

1-32349 83171 

•405 


0*20 

o-6oo 

+ 1-9419965440 

+ 0-85691 I2S76 

-1-5992320410 

-1*49660 40740 

0-400 

0 ' 4 ° 

*21 

*605 

•88988 44578 

1-03675 626IO 

•45444 6S282 

•64762 39289 

*395 

•42 

•22 

■610 

•82163 69238 

•20614 67349 

•29093 23604 

•7741569735 

*390 

*44 

•23 

•615 

*7377186362 

*3632489866 

1-11062 41024 

•8741360737 

• 3«5 

*46 

•48 

*24 

■620 

'6387271683 

•50633 26364 

0-9156875028 

•9458b 26378 

• 3 «o 


0*25 

0*625 

+ 1*52539 06250 

+ 1*6337890625 

-0-70849 60938 

-1-988037x094 

0*375 

o* 5 ° 

•26 

•630 

*3985643391 

•7441491608 

*4916067755 

*99978 46840 

•370 

* 5 3 

*27 

*635 

*25922 66018 

*8361002592 

•26773 246x8 

•98067 57886 

•365 

'<6 

■28 

•640 

1-10847 34282 

•9085O 20243 

- -0397x22946 

*9307409093 

•360 

• 5 ° 

.58 

•29 

•645 

°* 9475 I 2 3573 

•9604OXX994 

+ *1895203384 

*8504794031 

*355 


0 * 3 ° 

0*650 

+0-7776552960 | 

+ 1*99104 50176 

+0-41697 17146 

-1*74086 19889 

0*350 

o*6o 

•31 

-655 

•6003104184 

■99989 31335 

0-63962 33244 

*60332 66724 

*345 

*02 

*32 

•660 

•4169731373 

•98662 79241 

0-85446 87341 

•43976 79343 

*340 

*04 

*33 

*665 

•22921 61721 

•951X6 3OI3I 

1*0585514165 

•25251 90782 

*335 

’Du 

•68 

*34 

*670 

+ *0386787402 

•89364 98804 

1-249003x784 

1*0443277x91 

*330 


o *35 

0*675 

-0*1529450930 

+ 1-81448 24249 

+1*42308 27904 

-0-8183244716 

0*325 

o* 7 ° 

•36 

•680 

*34391 88662 

• 7 H 29 93588 

*5782144045 

•57798 49875 

•320 

7 Z 

*37 

•685 

*53247 69290 

•59398 43199 

•71202 53257 

*32708 55789 

* 3 X 5 

$ 

•38 

•690 

*71683 83524 

*45466 36020 

•82238 26900 

- *0696527577 

•310 

.70 

.78 

*39 

*695 

0*89522 14824 

•29770 14123 

*90742 85840 

+ *1900928832 

•305 

o*8o 

0*40 

0*700 

-1-06585 90720 

+ 1*12469 25824 

+ 1*9656131379 

+ 0*44779 79279 

0*300 

•82 

•41 

•705 

•22701 39x83 

°*93745 26725 

*9957251098 

0*69904 19175 

*295 

* 5 U 

•42 

*710 

•37699 49257 

*73800 54242 

•99691 94821 

0-93940 69407 

•290 

‘04 

•86 

*43 

*715 

•51417 35074 

•5285675376 

•9687415898 

1*16455 02296 

*285 

•88 

*44 

•720 

•6370002319 

•3115307654 

•9x11473054 

1*37027 88634 

*280 

n.QO 

*045 

0*725 

-1*74402 16110 

+ 0-08944 13401 

+ 1-82451 88171 

+ x*S 5263 55953 

0*275 

u 

•02 

*46 

*730 

•83389 69214 

- *1350232301 

•70967 55497 

*70792 47359 

•270 

y* 

•OA 

*47 

*735 

•90541 49407 

*35907 99344 

•56787 98024 

*83288 69486 

•265 

.06 

*48 

•740 

•9575104719 

•57986 85145 

*40083 66980 

•92467 17446 

*260 

0*98 

*49 

*745 

1*98928 05235 

0-79448 20887 

*21068 80766 

1*98095 64037 

*255 

1*00 

0*50 

0-750 

- 2-00000 00000 

-1-ooooo 00000 

+ 1-00000 ooooo 

+ 2*00000 ooooo 

0*250 

-x   -μ   1-X   -C_9   +C_10   -C_11   +C_12   X

$\tfrac12 x = \mu = 2X - 1 = \cos\theta$    $C_n = 2\cos n\theta = 2T_n$    $X = \cos^2\tfrac12\theta$    $1 - X = \sin^2\tfrac12\theta$



Chebyshev Polynomials


x   μ   X   C_9   C_10   C_11   C_12   1-X

1*00 

0-50 

0-750 

- 2-00000 00000 

- I *00000 00000 

+ I-ooooo 00000 

+ 2*00000 OOOOO 

0-250 

*02 

• 5 i 

*755 

1-98913 67538 

•19352 2715s 

07717435837 

1-98070 11712 

*245 

*04 

•52 

•760 

•95636 58359 

•37220 85718 

•52926 89213 

•92264 S2499 

•240 

*06 

*53 

•765 

•9OI58 27767 

•5333121198 

•27627 19297 

•82616 O3653 

*235 

•08 

*54 

•770 

•8249157183 

•67422 38025 

+ -0167540116 

•6923I 81350 

•230 

1*10 

o *55 

0-775 

- I-72673 62090 

-1-7925106399 

-0-2450254949 

+ 1-5229825955 

0-225 

‘12 

•56 

•780 

'60766 84628 

•88595 75732 

•50460 40191 

1-3208010717 

•220 

•14 

*57 

• 78 s 

•46859 68756 

•95260 90688 

75737 74628 

1-0891987612 

•215 

•16 . 

•58 

•790 

•3106715803 

•99081 05426 

0-99866 86491 

0*83235 49096 

•210 

•IS 

*59 

•795 

1*13531 18124 

•9992491171 

1*2238021458 

<>•$$$16 25851 

*205 

1*20 

o*6o 

0*800 

- 0*94420 68480 

-1-9769931776 

-1-4281849651 

+ 0-26317 12195 

0*200 

•22 

•61 

•805 

■7393142650 

•92353 01432 

•60739 25097 

- *0374887186 

*195 

•24 

•62 

*8lO 

•52285 52670 

•83880 18151 

•75725 89837 

*3401993247 

•190 

•26 

•63 

•815 

•2973067997 

•72323 66115 

•8739713307 

•6379672653 

*185 

•28 

•64 

•820 

- -0653901775 

•57777 79421 

•95416 55884 

o* 9235540 iii 

*l80 

1-30 

0-65 

0-825 

+ OT6994 4073O 

-1-4039079151 

-1-99502 43626 

-1*1896237563 

0-175 

•32 

•66 

•830 

■ 40 S 53 54573 

1-2036655087 

•99437 39288 

•42S90 80773 

*170 

*34 

•67 

•835 

■63803 71624 

0-9796582773 

•95077 92540 

•6343s 5923° 

•165 

•36 

•68 

•S4O 

0-86394 44386 

•7350665951 

*86363 500S0 

•7994770157 

*l60 

•38 

•69 

•845 

I-07962 80556 

•47363 93712 

•73325 03S78 

-9182461640 

*155 

1*40 

0*70 

0-850 

+ I-28l37 3l840 

-0*19968 01024 

-1-5609253274 

- 1-9856153559 

0-150 

•42 

•71 

*855 

•465424O696 

+ *0819779465 

1-3490153856 

•9975797940 

*145 

*44 

•72 

•860 

•62803 48763 

*36600 87184 

1-10098 23218 

•9514232618 

•140 

•46 

*73 

•865 

•76552 70898 

•64664 38586 

0*82142 70562 

•84592 73607 

*135 

•48 

*74 

•870 

•87435 38852 

0-9177381710 

■ 0*5161013921 

•6815682313 

•130 

1-50 

o *75 

0*875 

+ I- 95 II 7 18750 

+ 1-17285 15625 

-0-19189 45312 

-1-4606933594 

0-125 

■52 

•76 

•880 

•99292 06690 

■40534 90889 

+ 0-14320 99462 

1-18766 99707 

•120 

*54 

'll 

•885 

•99691 06886 

•60852 08024 

• 0-48021 13471 

0-86899 53279 

*115 

•56 

•78 

•89O 

•96091 96942 

•7757231939 

0-80920 84883 

o* 5 i 335 79521 

•no 

*58 

*79 

•895 

•88329 8496I 

•90054 41171 

1-1195612088 

-0-1316374071 

*105 

1*60 

o-So 

0-900 

+ I76308 6336O 

+1-97699 31776 

+1-4001027482 

+0-26317 12195 

o-ioo 

•62 

•81 

‘ 90 S 

•60013 64373 

•9997196721 

•639409431s 

0-65612 36069 

•095 

•64 

•82 

•910 

•39525 22406 

■9642602621 

•8261345893 

1-03060 04643 

•090 

*66 

•83 

•915 

I-I 5033 485 I 7 

•86731 S6760 

•9494141504 

*36870 88137 

•085 

•68 

•84 

•920 

0*86854 22473 

•7070798391 

■99935 18824 

*65183 13233 

•080 

1*70 

0*85 

0 - 92 S 

+ 0-55446 07970 

+ 1-4835609449 

+ 1-96759 28093 

+ 1*86134 68310 

0*0 75 

■72 

•86 

•930 

+0-21428 96760 

1-19900 20931 

1*84799 39241 

1*97954 74564 

•070 

‘74 

•87 

*935 

-0-14396 12420 

0-85829 92403 

1*63740 19201 

1-99078 01007 

•065 

•76 

•88 

•940 

0-51025 94066 

0-46948 23276 

1*3365483032 

1*88284 26860 

*060 

•78 

•89 

*945 

0*87232 29047 

+0-0442:41575S 

0-95107 29095 

1*64866 82032 

*055 

i-8o 

0*90 

0*950 

-1-21537 84320 

-0-40149 49376 

+ 0*49268 75443 

+ 1-28833 25174 

0-050 

•82 

•91 

•955 

1-52190 32668 

0-84692 91700 

-0*01950 78226 

0*81142 49328 

*045 

•84 

■92 

•960 

I- 77 I 3 S 05769 

1-26570 30966 

0*55754 31208 

+ 0*23982 37543 

*040 

•86 

*93 

•965 

i *93985 73746 

1-6251472651 

1-08291 65386 

- 0*38907 74966 

'035 

•88 

*94 

•970 

1*99993 44153 

1-88546 42666 

1-54473 84058 

P01S64 39364 

•030 

1*90 

0*95 

0*975 

-1-9201373210 

-1-9988433199 

-1-8776649868 

-1*56872 01550 

0*025 

•92 

•96 

•980 

1-66471 81921 

1-90850 20243 

1-9996056945 

1-93074 09093 

*020 

*94 

*97 

•985 

1*1932569519 

1-5476511791 

1-8091863356 

1*96217 03119 

•015 

•96 

•98 

*990 

-0*46027 16544 

-0-83837 84212 

-1*18295 00511 

1-48020 36790 

*010 

1*98 

o *99 

o *995 

+0-5851930364 

+0*30955 37318 

+0-02772 33526 

-0-25466 14937 

*005 

2*00 

1*00 

I *000 

+ 2-00000 00000 

+ 2-00000 00000 

+ 2-00000 OOOOO 

+ 2-00000 OOOOO 

0*000 

-x    1-X    -C_9    +C_10    -C_11    +C_12    X

$\tfrac12 x = \mu = 2X - 1 = \cos\theta$,   $C_n = 2\cos n\theta = 2T_n$,   $X = \cos^2\tfrac12\theta$,   $1 - X = \sin^2\tfrac12\theta$



202 


J. C. P. Miller

Auxiliary Tables 




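Columns, in order: $x = 2\cos\theta$;  $\tfrac12 x = \cos\theta$;  $\sqrt{4-x^2} = 2\sin\theta$;  $\sqrt{2+x} = 2\cos\tfrac12\theta$;  $\sqrt{2-x} = 2\sin\tfrac12\theta$;  $\cos^{-1}\tfrac12 x = \theta$ in radians;  $\theta$ in degrees.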

0*00 

0*00 

2*00000 00000 

1*41421 35624 

1-41421 35624 

1-57079 63268 

90-00000 000 

•02 

•01 

1-99989 99975 

•42126 70404 

•4071247279 

*5607961601 

89-42703 266 

•04 

*02 

•99959 996oo 

*42828 56857 

•40000 00000 

*55079 49932 

88-85400 800 

•o 6 

•03 

*99909 97974 

•43527 00094 

•39283 88277 

•5407918250 

88*28086 868 

*08 

*04 

•99839 93595 

•44222 05102 

•38564 06461 

•53078 56524 

87-70755 722 

0*10 

0*05 

1-9974984355 

1-4491376746 

1-3784048752 

1*52077 54700 

87*13401 602 

*12 

•06 

•9963967542 

•4560219779 

•3711309201 

•5107602683 

86*56018 723 

*14 

•07 

•99509 39827 

•46287 38838 

■3638181697 

■50073 90337 

85*98601 278 

*i6 

•08 

•99358 97271 

■46969 38457 

•35646 59966 

•49071 07468 

85*41143 426 

*18 

•09 

•99188 35307 

•47648 23060 

•34907 37563 

•48067 43818 

84-83639 291 

0*20 

o-io 

1-98997 48742 

1-48323 96974 

1-3416407865 

1-47062 89056 

84*26082 952 

•22 

•II 

•9878631744 

■48996 64426 

•33416 64064 

•46057 32768 

83-68468 443 

•24 

*12 

•98554 77834 

•49666 29547 

*32664 99161 

*45050 64444 

83-10789 742 

*26 

*13 

•98302 79877 

•50332 96378 

•3190905958 

•44042 73471 

82-53040 768 

*28 

*14 

•98030 30071 

•50996 68871 

•3114877049 

•43033 49121 

81-95215375 

0*30 

0-15 

1*97737 19933 

1*51657 50888 

1*30384 04810 

1-42022 80540 

81-37307344 

.32 

•16 

•9742340287 

■52315 46212 

•2961481397 

•41010 56738 

80-79310378 

*34 

■17 

•9708881247 

•52970 5S541 

•28840 98727 

•3999666577 

80-21218 694 

•36 

1 * IS 

•96733 32204 

•53622 91496 

■28062 48475 

•3898098755 

79-63024019 

•38 

•19 

•9635681806 

*54272 48621 

•27279 22061 

•37963 41S03 

79-04721580 

0*40 

0*20 

1-95959 17942 

t -54919 333«5 

1*26491 10641 

1'36943 84060 

78-46304097 

•42 i 

•21 

•9554027718 

•55563 49186 

•25698 05090 

■35922 >3670 

77-87764 776 

•44 

*22 

•9509997437 

•5620499352 

•24899 95997 

•3489818563 

77-29096 701 

*46 

■23 

•94638 12576 

•56843 87141 

•24096 73646 

•33871 86439 

76*70292 825 

*48 

■24 

•9415457759 

•57480 15748 

•23288 28006 

*32843 0475« 

76-11345964 

0-50 

0 *2S 

1-93649 16731 

1*58113 883OI 

1-22474 48714 

1-31811 60717 

75-52248 781 

•52 

•26 

•93121 72327 

•58745 07866 

•21655 25061 

•3077741239 

74-92993 786 

•54 

•27 

•9257206443 

*59373 77451 

•2083045974 

•29740 32953 

74'33573 3*5 

•56 

*28 

•92000 00000 

•60000 00000 

•20000 00000 

•2870022176 

7373979 5*9 

•58 

•29 

•91405 32908 

•60623 78404 

•1916375288 

•27656 94890 

73-14204398 

o-6o 

0*30 

1-90787 84028 

1*61245 15497 

1-18321 59566 

1*2661036728 

72-54239 688 

•62 

•3 1 

*9014731131 

*61864 14056 

*17473 40124 

•2556032944 

71-94076951 

•64 

.32 

•89483 50852 

•62480 76809 

•1661903790 

■2450668395 

7 I -33707 5 I 2 

*66 

•33 

*8879618640 

•63095 06430 

*15758 36903 

*23449 275*6 

7073122 451 

•68 

■34 

•88085 08713 

■63707 05544 

•14891 25293 

*22387 94293 

70-12312593 

0*70 

o -35 

1*8734993995 

1*6431676725 

1*1401754251 

1*21322 52231 

69*51268 489 

•72 

•36 

•86590 46064 

•64924 22502 

•13137 08499 

•20252 84334 

68*89980 398 

•74 

*37 

•8580635081 

* 655 2 945357 

*1224972160 

•19178 73061 

68*28438 272 

•76 

•38 

„ *84997 29728 

•66132 47726 

*11355 28726 

•18100 00303 

67*66631 734 

•78 

•39 

■ *8416297131 

■66733 32001 

*1045361017 

•1701647341 

67*04550 060 

0*80 

0*40 

1*83303 02780 

1-6733200531 

1*0954451150 

1*15927 94807 

66-42182 152 

•82 

*41 

*82417 10446 

*67928 55624 

*08627 80491 

*1483422646 

65*79516520 

•84 

•42 

*8150482087 

•68522 99546 

*07703 29614 

•13735 10067 

65-16541 251 

*86 

*43 

*80565 77749 

•6911534525 

•06770 78252 

*1263035499 

64*53243 986 

*88 

*44 

‘79599 55457 

*69705 62748 

•0583005244 

*11519 76534 

63*89611 886 

0*90 

o *45 

1*78605 71099 

1*70293 86366 

1*0488088482 

1*10403 09877 

63*25631 605 

*92 

•46 

*77583 78304 

•70880 07491 

•03923 04845 

*0928011283 

62-61289 250 

•94 

•47 

•76533 28298 

•71464 28199 

•0295630141 

*08150 55488 

61-96570 347 

*96 

*48 

*75453 69760 

•72046 50534 

•01980 39027 

•0701416144 

6 I- 3 I 459 799 

0*98 

•49 

*7434448658 

•72626 76502 

•00995 04938 

*0587065739 

60-65941 842 

1*00 

0*50 

1*73205 08076 

1-7320508076 

1*00000 00000 

1*04719 75512 

60-00000 000 



1*00 

0*50 

1*73205 08076 

1*73205 08076 

1*00000 OOOOO 

1-0471975512 

60*00000 000 

*02 

*51 

•72034 88018 

•7378147197 

0-9899494937 

•0356115365 

59*33617026 

*04 

•52 

•70833 25203 

•74355 95774 

•97979 58971 

•02394 53761 

58-66774 850 

•06 

*53 

■69599 52830 

•74928 55685 

■96953 59715 

•0x21957615 

57-99454517 

*08 

*54 

•68333 00330 

•75499 28775 

•9591663047 

1*00035 92174 

57-31636115 

1*10 

o*55 

1-67032 93088 

I*76068 16862 

0-94868 32981 

0-98843 20889 

56-63298 703 

*12 

•56 

•65698 52142 

•76635 21733 

•9380831520 

*97641 05268 

55*94420 226 

*14 

*57 

•64328 93841 

•7720045147 

•9273618495 

•9642904716 

55-24977 425 

*l6 

•58 

•62923 29484 

•77763 88835 

•9165151390 

•95206 76361 

54-54945 736 

*l8 

*59 

*61480 64900 

•78325 54500 

•90553 S513S 

•93973 74860 

53*84299 180 

1*20 

o*6o 

1*6000000000 

I-78885 43820 

0-89442 71910 

0-92729 52180 

53*13010 235 

*22 

•61 

*58480 28269 

•79443 58445 

•88317 60866 

•91473 S73S9 

52-41049704 

*24 

•62 

•5692036197 

•80000 OOOOO 

•87177 97887 

•90205 36236 

51-68386553 

•26 

•63 

•5531902652 

•8055470085 

■86023 25267 

•88924 31152 

50-94987746 

•28 

•64 

•5367498170 

•81107 70276 

•8485281374 

•87629 80612 

50-208l8050 

1*30 

0*65 

1-5198684154 

1-81659 02125 

0*83666 00265 

0-8632I I89OI 

49-45839813 

.32 

•66 

•5025311977 

•82208 67158 

•82462 11251 

•84997 75659 

4870012 721 

*34 

•67 

•48472 21962 

•82756 66882 

•8124038405 

•8365875393 

47-93293 520 

*36 

•68 

*46642 42224 

•83303 02780 

•80000 OOOOO 

*82303 3692r 

47-15635696 

•38 

•69 

•4476187343 

•8384776311 

•7874007874 

*80930 72740 

46-36989113 

1*40 

0-70 

1*4282856857 

1-8439088915 

0-7745966692 

0*79539 88302 

45-57299600 

•42 

•71 

•4084033513 

•84932 42009 

•7615773 iq 6 

*7812981174 

44-76508467 

*44 

•72 

•3879481258 

•85472 36991 

•7483314774 

•76699 40079 

43-9455I956 

•46 

*73 

•36689 42900 

•8601075238 

•7348469228 

•7524743762 

43-11360595 

.48 

*74 

•3452137377 

•86547 58106 

•72m 02551 

•73772 59685 

42-26858 443 

1*50 

o*75 1 

1-32287 56555 

1-87082 86934 

07071067812 

0*72273 42478 

41-40962 211 

*52 

•76 

•2998461447 

•8761663039 

•69282 03230 

•7074832118 

40-53580211 

*54 

*77 

•2760877713 

*88148 87722 

•67823 29983 

•6919551751 

39-64611115 

*56 

•78 

•2515590278 

•88679 62264 

*6633249581 

•67613 05096 

38-73942 460 

•58 

*79 

•22621 36845 

•89208 87928 

•64807 40698 

•65998 73294 

37-81448851 

1*60 

o*8o 

1-20000 OOOOO 

1*8973665961 

0-63245 55320 

0-64350 11088 

36-86989 765 

•62 

•81 

•17285 97529 

•90262 97590 

•61644 14003 

•6266442116 

35-90406 858 

•64 

*82 

•X447270417 

*90787 84028 

•60000 OOOOO 

•60938 53080 

34-91520625 

•66 

*83 

•II552 67814 

*91311 26470 

•58309 5IS95 

•59168 86424 

33*90126 200 

•68 

•84 

•08517 27973 

■91833 26093 

*56568 54249 

*5735 1 3 I0 44 

32*85988 038 

1*70 

0-85 

1*0535653753 

1*92353 84062 

0*54772 25575 

0*5548110330 

31*78833 062 

*72 

•86 

I-02058 80658 

•92873 01522 

•52915 02622 

*53552 66543 

30-68341 in 

*74 

•87 

0*9861034428 

•9339079606 

•5099019514 

*5*559 40062 

29*54136 050 

*76 

*88 

•94994 73670 

•9390719430 

•48989 79486 

*494934 x 263 

28*35763 658 

•78 

*89 

•91192 10492 

•94422 22095 

•46904 15760 

*47345 1 x 573 

27*1267 5 312 

1*80 

0-90 

0-87177 97887 

1*9493588690 

0-44721 35955 

0-45102 68118 

25*84193 276 

•82 

•91 

*82921 64977 

*95448 20286 

*42426 40687 

*42751 22649 

24*49464847 

*84 

•92 

*78383 67177 

*9595917942 

•40000 OOOOO 

•40271 58416 

23*07391807 

*86 

*93 

*735* 1 90380 

•96468 82704 

•37416 57387 

•37638 34823 

21*56518502 

•88 

*94 

*68234 88844 

•9697715604 

•34641 01615 

*3481660213 

19-94844359 

I*Q0 

°*95 

0*62449 97998 

1-9748417658 

0*31622 77660 

0*3175604293 

18*19487 234 

•92 

•96 

*56000 OOOOO 

■97989 89873 

*28284 27125 

*2837941092 

16*26020471 

*94 

*97 

•48620 98312 

•98494 33241 

•24494 89743 

*2455655175 

14*06986 775 

•96 

•98 

*3979949748 

•9899748742 

*20000 OOOOO 

*20033 48423 

11*47834095 

1*98 

o*99 

•2821347196 

1*99499 37343 

*1414213562 

•X4I53 94733 

8*10961446 

2*00 

1*00 

0*00000 OOOOO 

2*00000 OOOOO 

0-00000 00000 

0*00000 OOOOO 

0*00000 000 




XXII.— Two Numerical Applications of Chebyshev Polynomials. By J. C. P. Miller, 

Ph.D., University of Liverpool. Communicated by Dr A. C. Aitken, F.R.S.

(MS. received October 23, 1944. Read May 7, 1945) 

1. Introduction

In this paper two computational processes are outlined in which the table of Chebyshev
Polynomials $C_n(x) = 2\cos(n\cos^{-1}\tfrac12 x)$ given in the preceding paper may be used with effect;
these processes are (a) interpolation and (b) Fourier synthesis. A brief outline is also given
of the idea behind the process of “Economization of Power Series” developed in Lanczos,
1938; this is related to (a). Finally the application of (b) to the calculation of Mathieu
functions is considered.

Ten-decimal tables are given of the auxiliary functions needed for Fourier synthesis.
These functions are $\sqrt{4-x^2} = 2\sin\theta$, $\sqrt{2+x} = 2\cos\tfrac12\theta$, $\sqrt{2-x} = 2\sin\tfrac12\theta$; $\cos^{-1}\tfrac12 x = \theta$ is
given also, to 10 decimals of a radian and to 8 decimals of a degree.
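The definitions in this section can be sketched in a few lines of Python. This is an illustration only, not the method by which the printed tables were computed; the function names `cheb_c` and `auxiliary_row` are hypothetical. The recurrence $C_{k+1} = x\,C_k - C_{k-1}$ used here follows from $C_n(x) = 2\cos n\theta$ with $x = 2\cos\theta$.

```python
import math

def cheb_c(n: int, x: float) -> float:
    """C_n(x) = 2 cos(n arccos(x/2)), built from C_0 = 2, C_1 = x and
    the recurrence C_{k+1} = x*C_k - C_{k-1}."""
    prev, cur = 2.0, x
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, cur = cur, x * cur - prev
    return cur

def auxiliary_row(x: float) -> dict:
    """One row of the auxiliary tables for a given x in [-2, 2]."""
    theta = math.acos(x / 2.0)
    return {
        "half_x": x / 2.0,                        # cos(theta)
        "sqrt_4_minus_x2": math.sqrt(4.0 - x * x),  # 2 sin(theta)
        "sqrt_2_plus_x": math.sqrt(2.0 + x),        # 2 cos(theta/2)
        "sqrt_2_minus_x": math.sqrt(2.0 - x),       # 2 sin(theta/2)
        "theta_rad": theta,
        "theta_deg": math.degrees(theta),
    }

row = auxiliary_row(0.56)
c9 = cheb_c(9, 1.6)
```

For $x = 0.56$ the three square roots come out as the exact values 1.92, 1.6 and 1.2, which makes that row a convenient spot check of the auxiliary table.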

Throughout this paper the notation used is that of the preceding paper (in particular § 2). 

2. Economisation of Power Series 

The reader is referred to the valuable and very readable paper of Lanczos, 1938, for a
full discussion of this process, which is closely related to the interpolation process described
in § 3; only a brief indication of the underlying principle can be given here. Lanczos’s
process replaces a power series (possibly asymptotic in character) by a derived series in which
Chebyshev polynomials are used in place of powers, and which in general requires fewer,
often many fewer, terms to give results with a specified accuracy over a given range of the
argument. Instead of using this series in conjunction with tables of Chebyshev Polynomials,
Lanczos then curtails it at the appropriate term, say the $m$th, and reconverts to a power
series, which includes powers up to the $m$th only; the coefficients are modified and depend,
of course, on $m$. Nevertheless, within the specified accuracy and range of argument, the
modified series is equivalent to the original power series carried perhaps to many more terms.
Thus, for example, in terms of $T_n(\mu) = \cos(n\cos^{-1}\mu)$, Lanczos (p. 156) finds:

$$\frac{1}{1+\mu} = 1 - \mu + \mu^2 - \mu^3 + \ldots$$

$$\phantom{\frac{1}{1+\mu}} = \sqrt2\,\{\tfrac12 - f\,T_1(\mu) + f^2\,T_2(\mu) - f^3\,T_3(\mu) + \ldots\}$$

in which $f = 3 - 2\sqrt2 = 0.17157$. Retaining terms to $T_6(\mu)$ in the latter series,

$$\frac{1}{1+\mu} \approx 0.9999925 - 0.9992202\,\mu + 0.9863809\,\mu^2 - 0.9070856\,\mu^3 + 0.6753632\,\mu^4 - 0.3293104\,\mu^5 + 0.0738853\,\mu^6$$

in the range $0 \le \mu \le 1$, with an error which nowhere exceeds $6 \times 10^{-6}$ within this range.
For comparison, the following table gives N, the number of terms of the original power series
needed for similar accuracy in the range $-M \le \mu \le +M$.

    M     0.1    0.5    0.8    0.9
    N     6      17     53     114
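The economization step can be sketched as follows. This is a rough illustration and not Lanczos's own procedure: it assumes that the Chebyshev polynomials in the expansion are taken over the range 0 to 1, i.e. as $T_n(2\mu - 1)$, and that the series is $\sqrt2\{\tfrac12 + \sum_{n \ge 1}(-f)^n T_n(2\mu-1)\}$ with $f = 3 - 2\sqrt2$. All function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_axpy(p, q, scale):
    """Return p + scale*q for ascending coefficient lists."""
    out = [0.0] * max(len(p), len(q))
    for i, a in enumerate(p):
        out[i] += a
    for i, b in enumerate(q):
        out[i] += scale * b
    return out

def shifted_chebyshev(m):
    """Coefficient lists of T_n(2*mu - 1), n = 0..m, in powers of mu."""
    twice_s = [-2.0, 4.0]                 # 2*(2*mu - 1)
    polys = [[1.0], [-1.0, 2.0]]
    for _ in range(2, m + 1):
        polys.append(poly_axpy(poly_mul(twice_s, polys[-1]), polys[-2], -1.0))
    return polys[: m + 1]

def economized_reciprocal(m):
    """Power series of degree m obtained by truncating the assumed
    Chebyshev expansion of 1/(1 + mu) on 0 <= mu <= 1 after T_m."""
    f = 3.0 - 2.0 * SQRT2
    t = shifted_chebyshev(m)
    series = [0.5 * SQRT2]
    for n in range(1, m + 1):
        series = poly_axpy(series, t[n], SQRT2 * (-f) ** n)
    return series

def horner(coeffs, x):
    total = 0.0
    for c in reversed(coeffs):
        total = total * x + c
    return total

coeffs = economized_reciprocal(6)
max_err = max(abs(horner(coeffs, k / 1000.0) - 1.0 / (1.0 + k / 1000.0))
              for k in range(1001))
```

Under these assumptions the seven coefficients agree with those quoted in the text to about four decimals, and the error of the degree-6 polynomial stays below $8 \times 10^{-6}$ on the whole range, the largest departure being at $\mu = 0$.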

3. Interpolation by Chebyshev Polynomials 
3.1. It is a well-known property of the Chebyshev Polynomials that, of all polynomials

$$p_n(x) = x^n + a_1 x^{n-1} + \ldots + a_n$$

with leading coefficient unity, $C_n(x)$ is the one for which the greatest departure from zero in
the interval $-2 \le x \le +2$ is as small as possible. This property can be used with effect for
interpolation. The power series

$$f(T) = f(a+t) = A_0 + A_1\left(\frac{4t}{h}\right) + A_2\left(\frac{4t}{h}\right)^2 + \ldots + A_p\left(\frac{4t}{h}\right)^p + \ldots \tag{3.11}$$

may be replaced over the interval $-2 \le 4t/h \le +2$ (that is, over $-\tfrac12 h \le t \le +\tfrac12 h$, or
$a - \tfrac12 h \le T \le a + \tfrac12 h$) by the series

$$f(T) = f(a+t) = \alpha_0 + \alpha_1 C_1(4t/h) + \alpha_2 C_2(4t/h) + \ldots + \alpha_p C_p(4t/h) + \ldots \tag{3.12}$$

We shall see below (3.23, 3.24) that in general (i.e. if $h$ is not too great) $A_p$ and $\alpha_p$ are quantities
of similar magnitude. Thus, in the interval $a - \tfrac12 h \le T \le a + \tfrac12 h$, the series (3.12) is more
rapidly convergent than (3.11) when $1 < |4t/h| \le 2$. It is true that (3.11) is more rapidly
convergent when $|4t/h| < 1$, but to retain the degree of convergence of (3.12) when
$|4t/h| \le 2$, the power series could be used only over the range $|4t/h| \le 1$; to cover a
large range of $T$ with a given maximum number of terms, it would thus be necessary to
provide series of type (3.11) for twice as many values of $x$, i.e. at half the interval $h$ in $x$, as
are needed for series of type (3.12).

3.2. In terms of derivatives, the coefficients $\alpha_p$ may be determined as follows, using the
calculus of operators. Writing $t = \tfrac12 h\cos\theta$, and assuming

$$f(a+t) = \alpha_0 + \sum_{p=1}^{\infty}\alpha_p\,C_p\!\left(\frac{4t}{h}\right) = \alpha_0 + 2\sum_{p=1}^{\infty}\alpha_p\cos p\theta,$$

it follows (cf. Watson, 1922, p. 181) that

$$\alpha_p = \frac{1}{\pi}\int_0^{\pi} f(a + \tfrac12 h\cos\theta)\cos p\theta\,d\theta
= \frac{1}{\pi}\int_0^{\pi} e^{\frac12 hD\cos\theta}\cos p\theta\,d\theta\;f(a)
= I_p(\tfrac12 hD)\,f(a),$$

in which $D^n f(a)$ is written for $\left[\dfrac{d^n}{dt^n}f(a+t)\right]_{t=0}$. Thus


( 3 - 2 1) 


(3-22) 


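Since

$$A_p = \frac{(\tfrac14 h)^p}{p!}\,f^{(p)}(a) \tag{3.23}$$

$$\alpha_p = \frac{(\tfrac14 h)^p}{p!}\left\{f^{(p)}(a) + \frac{(\tfrac14 h)^2}{1!\,(p+1)}\,f^{(p+2)}(a) + \frac{(\tfrac14 h)^4}{2!\,(p+1)(p+2)}\,f^{(p+4)}(a) + \ldots\right\} \tag{3.24}$$

it follows that, if $h$ is not too large, $\alpha_p \approx A_p$, as stated above.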
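The closeness of the two sets of coefficients is easy to check numerically. The sketch below takes the test function $f(t) = e^t$ about $a = 0$ (a choice made here for illustration, not taken from the paper), for which every derivative is unity. It assumes that the Chebyshev coefficient is the modified-Bessel series $\sum_k (\tfrac14 h)^{p+2k}/\{k!\,(p+k)!\}$ and that the power-series coefficient is $(\tfrac14 h)^p/p!$. The function names are hypothetical.

```python
import math

def alpha_from_derivatives(p, h, terms=25):
    """Chebyshev coefficient alpha_p for f = exp about a = 0:
    sum over k of (h/4)^(p+2k) / (k! (p+k)!)."""
    return sum((h / 4.0) ** (p + 2 * k) / (math.factorial(k) * math.factorial(p + k))
               for k in range(terms))

def alpha_from_integral(p, h, nodes=64):
    """(1/pi) * integral over 0..pi of exp((h/2) cos phi) cos(p phi) d phi,
    evaluated by the midpoint rule."""
    total = 0.0
    for k in range(nodes):
        phi = math.pi * (k + 0.5) / nodes
        total += math.exp(0.5 * h * math.cos(phi)) * math.cos(p * phi)
    return total / nodes

def power_series_coefficient(p, h):
    """Coefficient A_p of (4t/h)^p in the Taylor series of exp(t)."""
    return (h / 4.0) ** p / math.factorial(p)

h = 0.5
ratios = [alpha_from_derivatives(p, h) / power_series_coefficient(p, h)
          for p in range(6)]
```

With $h = 0.5$ every ratio lies between 1 and 1.02, which illustrates how little the two sets of coefficients differ once $h$ is moderate.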

3.3. The coefficients $\alpha_p$ may also be expressed in terms of central differences of $f(a + \theta h)$*
for $\theta = 0, \pm1, \pm2, \ldots$ This may be done in two ways: (a) by substitution for the $f^{(p)}(a)$
of equivalent expressions in terms of central differences, see, for example, Comrie, 1936,
p. 802, or, if more coefficients are needed, see Oppolzer, 1880, pp. 21, 23; (b) by writing
$\theta = \tfrac14 x$ in Stirling’s interpolation formula

$$f_\theta = f(a + \theta h) = f_0 + \theta\,\mu\delta f_0 + \frac{\theta^2}{2!}\,\delta^2 f_0 + \frac{\theta(\theta^2-1)}{3!}\,\mu\delta^3 f_0 + \frac{\theta^2(\theta^2-1)}{4!}\,\delta^4 f_0 + \ldots, \tag{3.31}$$

or by writing $\theta = \tfrac12 + \tfrac14 x$ in Bessel’s interpolation formula

$$f_\theta = f(a + \theta h) = \mu f_{1/2} + (\theta - \tfrac12)\,\delta f_{1/2} + \frac{\theta(\theta-1)}{2!}\,\mu\delta^2 f_{1/2} + \frac{\theta(\theta-1)(\theta-\tfrac12)}{3!}\,\delta^3 f_{1/2} + \ldots \tag{3.32}$$

and then substituting for powers of $x$ their expressions in terms of the polynomials $C_n(x)$;
these expressions are given in section 3.3 of the preceding paper.

* Note the use of the customary $\theta$ for a fraction or multiple of the tabular interval $h$; no confusion with
$\theta = \cos^{-1}\tfrac12 x$ should arise.





The coefficients $\alpha_p$ are given below in two forms, (i) suitable for the interval $(a - \tfrac12 h,\ a + \tfrac12 h)$
of the argument $a + \theta h$, that is for $-\tfrac12 \le \theta \le +\tfrac12$: coefficients derived from Stirling's formula
(3.31) and checked by process (a); (ii) suitable for the tabular interval $(a,\ a + h)$, that is for
$0 \le \theta \le 1$ so that $\theta = X$ of the preceding paper: coefficients derived from Bessel’s formula
(3.32) and checked by process (a), but with substitution for $f^{(p)}(a + \tfrac12 h)$ instead of $f^{(p)}(a)$.

$$f(a + \theta h) = \alpha_0 + \alpha_1 C_1(4\theta) + \alpha_2 C_2(4\theta) + \ldots + \alpha_p C_p(4\theta) + \ldots, \qquad -\tfrac12 \le \theta \le +\tfrac12$$


ig - ifa + _ J _ 8 z _ | _ 794 _g 6 _ i 12029 283 547821 12598 67730 , 12 

*° ^ 2 4 .2! 2 s .41 2 12 .6! 2 16 .8! + 2 20 .io! 2 24 -ts! 


a x = 

a 2 






2 14 .7! 


2 18 .9! 


2 84 . 12 
I 12598 67730 

2 22 .III 


. . . 


2 4 .2 l 


12 m . s » 1 °°S 8 V , 2 33 355 Q 6 , in I 00297 74536 ^ + _ 

2 8 .4! 2 12 .6! 2 16 .8! 2 20 .lo! 2 a4 .I2l 


«3 = ,e 

a 4 = 

«s = 
a 6 = 

a 7 = 

a 8 = 


2 e . 3 l 

I 


/* 5 3 - 


75 


2 io. 5 r 2 i 4 . 7 ! j 




2 18 . 9 l 


2 s . 4! 

I 

2 10 . 5 r 

I 

2 12 . 6 >' 

I 

1 


#- 


74 4 29 J 27I > ° 1 I20 ^ - 8 q 975^ 12 _ 


2 12 . 6 ! 2 16 . 8 ! 


2 20 .101 


2 24 . 12 ! 




2 14 »7! 
216 


2 18 . 9 ^ 


2 za . 1 X 1 


2 16 .81 


» + “SW>-ggSgig6 J11+ _ _ . 


2 20 .IO! 


2 24 . 12! 


• • • 


2 18 .9! 


a» = 


1 • • • 


;g .-^Z 2 _ 3 » + 153 154 ^ _ 


2 18 .9! 

I 


2 16 .8! 2 20 .io! 2 24 .12! 


2 2a . IO! 


d 10 - 


2 22 .Ilf 

868 

2 24 . I2l 


! + . 


$$f(a + \theta h) = \alpha_0 + \alpha_1 C_1(4\theta - 2) + \alpha_2 C_2(4\theta - 2) + \ldots + \alpha_p C_p(4\theta - 2) + \ldots, \qquad 0 \le \theta \le 1$$




35 _ 3466 6 76003 _ 2183 7 8238 + 10 S 48 6 ? 8990? ^, _ 

^ ~32 H \ t Ai r „ 1 B U» i ^ 2 20 IQ! ^ rt24 f 


2 8 .4! 


2 l 2 . 6 r 


2 16 . 8 ! 


2 24 .12 1 


«l = 


1 <5 _ _J_ i(5 3 +t #_5= - # 33 ^, + 6 _ 4 f 734 ffl _ 2 o 82_7 7 290 yi + 


2 2 .l! 

I 

2 4 .2! 

~a»- 

2®.3! 2 


2®.3! 2 10 .5! 


‘. 7 i 


2 18 .9! 


2 22 . 11 ! 




35 y , 3465 r 6 75676 


2 14 .7» 


., 2182 43674, 311 


^ _ 1 Si 134 S« , 29596 
4 2 s . 4!^ 2 12 .6!^ + 2 1 ®.8! / 


2 1S . 91 ~ ' 2 22 . 1 1 ! 

»- 102 355 !:2 51621 31183 _ 

20 TOt ^ + 2 24 . 12 ! ^ * * 


2 20 .IO! 


a 6 = 


1 - d >- +? 2£^ 9 -12122222511 + 


2 10 . 5 l 2 14 . 7 l 2 18 - 9 ! 


2 22 .III 


a« = 


2 12 . 6 ! 


... 


2 i 8 . 8 r 


2 24 . 12 ! 


a 7 = __i_ ^ + L315633U _ 

7 2 14 .71 2 18 .9! 2 22 .ii! 


a fl = 


a* = 


2 18 . 8 r 


jud B - 




2 2 °.ior 


.12V 


2 18 .p! 

1 

’2 20 .10 


( 5 ® - 


649 


2 22 .11! 


.<5114. 


-^ io - 


1132 

2 a4 .I2 


■/.18™ + . . . 


Denominators

     n    4^n . n!
     1    4 = 2^2
     2    32 = 2^5
     3    384 = 2^7 . 3
     4    6144 = 2^11 . 3
     5    1 22880 = 2^13 . 3 . 5
     6    29 49120 = 2^16 . 3^2 . 5
     7    825 75360 = 2^18 . 3^2 . 5 . 7
     8    26424 11520 = 2^23 . 3^2 . 5 . 7
     9    9 51268 14720 = 2^25 . 3^4 . 5 . 7
    10    380 50725 88800 = 2^28 . 3^4 . 5^2 . 7
    11    16742 31939 07200 = 2^30 . 3^4 . 5^2 . 7 . 11
    12    8 03631 33075 45600 = 2^34 . 3^5 . 5^2 . 7 . 11






A study of the coefficients makes it clear that, even if high differences are needed for
interpolation, a smaller number of $\alpha_n$ will usually suffice. If much interpolation within each
interval of tabulation is likely, it may, therefore, be better first to evaluate the first few $\alpha_n$,
as far as needed, and to use these in conjunction with a table of Chebyshev Polynomials.

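3.4. Numerical Illustration. As an example take

$$f(a+t) = \sqrt2\cos(\tfrac14\pi + t), \quad\text{with } h = \tfrac12\pi.$$

The formula (3.22) then gives the following convenient expansion of $f(T)$ for
$0 \le T = \tfrac14\pi + t \le \tfrac12\pi$, that is for $-\tfrac12 \le \theta = 2t/\pi \le +\tfrac12$,

$$\sqrt2\cos(\tfrac14\pi + \tfrac12\pi\theta) = J_0(\tfrac14\pi) - J_1(\tfrac14\pi)\,C_1(4\theta) - J_2(\tfrac14\pi)\,C_2(4\theta) + J_3(\tfrac14\pi)\,C_3(4\theta) + + - - \ldots$$

The power series in $\theta$ is exhibited alongside for comparison.

    sqrt(2) cos(pi/4 + pi*theta/2)

    in terms of C_n = C_n(4 theta)       in powers of (2 theta)

    + 0.85163 19137 0                    + 1
    - 0.36318 78383 5  C_1               - 0.78539 81634 0  (2 theta)
    - 0.07321 83222 0  C_2               - 0.30842 51375 3  (2 theta)^2
    + 0.00971 00145 3  C_3               + 0.08074 55121 9  (2 theta)^3
    + 0.00096 07246 6  C_4               + 0.01585 43442 4  (2 theta)^4
    - 0.00007 58464 6  C_5               - 0.00249 03945 7  (2 theta)^5
    - 0.00000 49824 8  C_6               - 0.00032 59918 9  (2 theta)^6
    + 0.00000 02802 9  C_7               + 0.00003 65762 0  (2 theta)^7
    + 0.00000 00137 9  C_8               + 0.00000 35908 6  (2 theta)^8
    - 0.00000 00006 0  C_9               - 0.00000 03133 6  (2 theta)^9
    - 0.00000 00000 2  C_10              - 0.00000 00246 1  (2 theta)^10
                                         + 0.00000 00017 6  (2 theta)^11
                                         + 0.00000 00001 1  (2 theta)^12
                                         - 0.00000 00000 1  (2 theta)^13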

When $|\theta| \le \tfrac12$, $|C_n(4\theta)| \le 2$ while $|2\theta|^n \le 1$; the power series has more terms for
$\tfrac12 < |2\theta| \le 1$, that is over half the total range.

The interval chosen, $h = \tfrac12\pi$, would be too large for the convenient tabulation of $f(T)$, but
the example illustrates the principle that a table of the coefficients $\alpha_n$ at a given interval, $h$,
of the argument would be smaller than a table of function and derivatives giving equal
accuracy. As shown by the example in section 2, the saving may be even greater when the
power series has radius of convergence finite, or zero (as with asymptotic series).
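The coefficients of this illustration can be checked without tables of Bessel functions. The sketch below evaluates the defining integral $\alpha_p = \frac{1}{\pi}\int_0^\pi f(a + \tfrac12 h\cos\phi)\cos p\phi\,d\phi$ by the midpoint rule (a quadrature chosen here for convenience, not the method of the paper) and then sums the series in $C_p(4\theta)$ by recurrence. The function names are hypothetical.

```python
import math

def cheb_coefficients(f, a, h, count, nodes=64):
    """alpha_p, p = 0..count-1, by the midpoint rule applied to
    (1/pi) * integral over 0..pi of f(a + (h/2) cos phi) cos(p phi)."""
    alphas = []
    for p in range(count):
        total = 0.0
        for k in range(nodes):
            phi = math.pi * (k + 0.5) / nodes
            total += f(a + 0.5 * h * math.cos(phi)) * math.cos(p * phi)
        alphas.append(total / nodes)
    return alphas

def eval_series(alphas, theta):
    """alpha_0 + sum of alpha_p * C_p(4*theta) for |theta| <= 1/2,
    using C_0 = 2, C_1 = x, C_{p+1} = x*C_p - C_{p-1} with x = 4*theta."""
    x = 4.0 * theta
    prev, cur = 2.0, x
    total = alphas[0]
    for alpha in alphas[1:]:
        total += alpha * cur
        prev, cur = cur, x * cur - prev
    return total

def f(T):
    return math.sqrt(2.0) * math.cos(T)

alphas = cheb_coefficients(f, math.pi / 4.0, math.pi / 2.0, 11)
approx = eval_series(alphas, 0.3)
exact = f(math.pi / 4.0 + 0.3 * math.pi / 2.0)
```

Eleven coefficients, $\alpha_0$ to $\alpha_{10}$, reproduce the function at $\theta = 0.3$ to about ten decimals, in line with the length of the Chebyshev column of the illustration.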

4. Fourier or Harmonic Synthesis 

4.1. A table of Chebyshev Polynomials may also be useful in conjunction with Fourier
expansions, such as

$$f(x) = f(2\cos\theta) = \alpha_0 + 2\alpha_1\cos\theta + 2\alpha_2\cos 2\theta + \ldots + 2\beta_1\sin\theta + 2\beta_2\sin 2\theta + \ldots \tag{4.11}$$

If values of $f(x)$ are needed for which $x = 2\cos\theta$ is a tabular argument (that is, if $100x$ is an
integer as with the tables given in the preceding paper) the expansion may be written

$$f(x) = \alpha_0 + \alpha_1 C_1(x) + \alpha_2 C_2(x) + \alpha_3 C_3(x) + \ldots + \beta_1\sigma_1(x) + \beta_2\sigma_2(x) + \beta_3\sigma_3(x) + \ldots \tag{4.12}$$

$$\phantom{f(x)} = \alpha_0 + \alpha_1 C_1(x) + \alpha_2 C_2(x) + \alpha_3 C_3(x) + \ldots + \sqrt{4-x^2}\,\{\beta_1 + \beta_2 S_1(x) + \beta_3 S_2(x) + \ldots\}. \tag{4.13}$$

Tables of $\sigma_p(x)$ are not available, nor are tables of $S_p(x)$; it is hoped later to supplement
the table of $C_p(x)$ with a similar table giving $S_p(x)$ for $p = 2(1)11$ or 12.





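The need for tables of $S_p(x)$ may be circumvented, however, by means of the formulae
(3.72) of the preceding paper,

$$S_p = C_p + C_{p-2} + C_{p-4} + \ldots \tag{4.14}$$

with $C_1$ or $1 = \tfrac12 C_0$ as the final term. This gives

$$f(x) = \alpha_0 + \alpha_1 C_1(x) + \alpha_2 C_2(x) + \alpha_3 C_3(x) + \ldots + \sqrt{4-x^2}\,\{\beta_1 + \beta_2 C_1(x) + \beta_3[C_2(x) + 1] + \beta_4[C_3(x) + C_1(x)] + \ldots\}$$

$$\phantom{f(x)} = \alpha_0 + \alpha_1 C_1(x) + \alpha_2 C_2(x) + \alpha_3 C_3(x) + \ldots + \sqrt{4-x^2}\,\{\gamma_1 + \gamma_2 C_1(x) + \gamma_3 C_2(x) + \gamma_4 C_3(x) + \ldots\} \tag{4.15}$$

in which

$$\begin{aligned}
\gamma_1 &= \beta_1 + \beta_3 + \beta_5 + \ldots \\
\gamma_2 &= \beta_2 + \beta_4 + \beta_6 + \ldots \\
\gamma_3 &= \gamma_1 - \beta_1 = \beta_3 + \beta_5 + \ldots \\
\gamma_4 &= \gamma_2 - \beta_2 = \beta_4 + \beta_6 + \ldots \\
\gamma_5 &= \gamma_3 - \beta_3 = \beta_5 + \beta_7 + \ldots \\
\gamma_6 &= \gamma_4 - \beta_4 = \beta_6 + \beta_8 + \ldots
\end{aligned} \tag{4.16}$$

A table of $\sqrt{4-x^2}$ is given to help in the application of (4.15), see p. 202.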
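The rearrangement of the sine terms can be sketched as follows. The code assumes that the combined coefficients are the sums of every second $\beta$, namely $\gamma_k = \beta_k + \beta_{k+2} + \beta_{k+4} + \ldots$, and checks the result against a direct summation of the Fourier series. The function names and the sample coefficients are hypothetical.

```python
import math

def cheb_values(x, count):
    """[1, C_1(x), C_2(x), ...]: the first entry is (1/2)C_0 = 1."""
    vals = [1.0, x]
    prev, cur = 2.0, x
    while len(vals) < count:
        prev, cur = cur, x * cur - prev
        vals.append(cur)
    return vals[:count]

def gamma_from_beta(beta):
    """beta = [beta_1, beta_2, ...]; returns [gamma_1, gamma_2, ...] with
    gamma_k = beta_k + beta_{k+2} + beta_{k+4} + ... (backward accumulation)."""
    gamma = list(beta)
    for k in range(len(beta) - 3, -1, -1):
        gamma[k] += gamma[k + 2]
    return gamma

def synthesize(alpha, beta, x):
    """Fourier synthesis at x = 2 cos(theta), 0 <= theta <= pi, using only
    Chebyshev values C_p(x) and one multiplication by sqrt(4 - x^2)."""
    c = cheb_values(x, max(len(alpha), len(beta)))
    cos_part = sum(a * ci for a, ci in zip(alpha, c))
    gamma = gamma_from_beta(beta)
    sin_part = math.sqrt(4.0 - x * x) * sum(g * ci for g, ci in zip(gamma, c))
    return cos_part + sin_part

def direct(alpha, beta, theta):
    """alpha_0 + 2*sum alpha_p cos(p theta) + 2*sum beta_p sin(p theta)."""
    total = alpha[0]
    total += sum(2.0 * a * math.cos(p * theta) for p, a in enumerate(alpha) if p > 0)
    total += sum(2.0 * b * math.sin((k + 1) * theta) for k, b in enumerate(beta))
    return total

alpha = [0.5, 0.25, -0.125, 0.0625]          # sample alpha_0..alpha_3
beta = [0.3, -0.2, 0.1, 0.05, -0.01]         # sample beta_1..beta_5
x = 0.56
value = synthesize(alpha, beta, x)
reference = direct(alpha, beta, math.acos(x / 2.0))
```

Because the $\gamma$ depend only on the $\beta$, they are formed once and then reused for every tabular value of $x$.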

It may be noted that for work with a few figures only, up to perhaps 5 or 6, it is better
to use $\beta_1\sigma_1(x) + \beta_2\sigma_2(x) + \ldots$ for computation, but for many-figure work, in which $x$ is an
exact 2-decimal value, it is of some advantage to multiply $\beta_1, \beta_2, \beta_3, \beta_4, \ldots$ by the 0-, 2-, 4-,
6-, . . . decimal values $1, S_1, S_2, S_3, \ldots$ instead of by the many-figure $\sigma_1, \sigma_2, \sigma_3, \sigma_4, \ldots$;
the single final multiplication by $\sqrt{4-x^2}$ then applies to the sum of the whole series, and
not to $\beta_1$ alone.

4.2. If $100\sin\theta$ is an integer, it is best to substitute $\theta = \tfrac12\pi - \phi$ immediately, so that the
expansion, say,

$$f(2\sin\theta) = a_0 + 2a_1\cos\theta + 2a_2\cos 2\theta + 2a_3\cos 3\theta + \ldots + 2b_1\sin\theta + 2b_2\sin 2\theta + 2b_3\sin 3\theta + \ldots \tag{4.21}$$

becomes

$$f(2\cos\phi) = a_0 + 2b_1\cos\phi - 2a_2\cos 2\phi - 2b_3\cos 3\phi + \ldots + 2a_1\sin\phi + 2b_2\sin 2\phi - 2a_3\sin 3\phi - \ldots \tag{4.22}$$

in which $100\cos\phi$ is integral, and can be dealt with as in section 4.1; compare formulae (12)
on p. 81 of Van der Pol and Weijers, 1933.

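4.3. Processes similar to that described in section 4.1 may also be applied to the series

$$g(x) = g(2\cos\theta) = \kappa_1\cos\tfrac12\theta + \kappa_3\cos\tfrac32\theta + \kappa_5\cos\tfrac52\theta + \ldots \tag{4.31}$$

$$h(x) = h(2\cos\theta) = \mu_1\sin\tfrac12\theta + \mu_3\sin\tfrac32\theta + \mu_5\sin\tfrac52\theta + \ldots \tag{4.32}$$

in which terms $\kappa_{2p}\cos p\theta$ or $\mu_{2p}\sin p\theta$, with $p$ integral, have been omitted from consideration;
if they do occur, the process of section 4.1 is immediately applicable. Series of this type
arise as part of normal Fourier series in which the variable is $\psi = \tfrac12\theta$, but for which $x = 2\cos 2\psi$
is taken as variable rather than $2\cos\psi$.

Using (2.22) of the preceding paper, (4.31) becomes

$$\begin{aligned}
g(x) &= \cos\tfrac12\theta\,\sum_{p=0}^{\infty}\kappa_{2p+1}\{S_p(x) - S_{p-1}(x)\} \\
&= \tfrac12\sqrt{2+x}\,\{\kappa_1 + \kappa_3(C_1 - 1) + \kappa_5(C_2 - C_1 + 1) + \kappa_7(C_3 - C_2 + C_1 - 1) + \ldots\} \\
&= \tfrac12\sqrt{2+x}\,(\lambda_1 + \lambda_3 C_1 + \lambda_5 C_2 + \lambda_7 C_3 + \ldots)
\end{aligned} \tag{4.33}$$

in which $x = 2\cos\theta$ and

$$\begin{aligned}
\lambda_1 &= \kappa_1 - \kappa_3 + \kappa_5 - \kappa_7 + \ldots \\
\lambda_3 &= \kappa_1 - \lambda_1 = \kappa_3 - \kappa_5 + \kappa_7 - \ldots \\
\lambda_5 &= \kappa_3 - \lambda_3 = \kappa_5 - \kappa_7 + \kappa_9 - \ldots
\end{aligned} \tag{4.34}$$

Likewise (4.32) gives

$$\begin{aligned}
h(x) &= \sin\tfrac12\theta\,\sum_{p=0}^{\infty}\mu_{2p+1}\{S_p(x) + S_{p-1}(x)\} \\
&= \tfrac12\sqrt{2-x}\,\{\mu_1 + \mu_3(C_1 + 1) + \mu_5(C_2 + C_1 + 1) + \mu_7(C_3 + C_2 + C_1 + 1) + \ldots\} \\
&= \tfrac12\sqrt{2-x}\,(\nu_1 + \nu_3 C_1 + \nu_5 C_2 + \nu_7 C_3 + \ldots)
\end{aligned} \tag{4.35}$$

in which, again, $x = 2\cos\theta$ and

$$\begin{aligned}
\nu_1 &= \mu_1 + \mu_3 + \mu_5 + \mu_7 + \mu_9 + \ldots \\
\nu_3 &= \nu_1 - \mu_1 = \mu_3 + \mu_5 + \mu_7 + \ldots \\
\nu_5 &= \nu_3 - \mu_3 = \mu_5 + \mu_7 + \mu_9 + \ldots
\end{aligned} \tag{4.36}$$

Tables of $\sqrt{2+x}$ and $\sqrt{2-x}$ are given to help in the application of (4.33) and (4.35), see p. 202.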
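A sketch of the half-angle case follows. It assumes that the combined coefficients are the alternating tail sums $\lambda_{2p+1} = \kappa_{2p+1} - \kappa_{2p+3} + \kappa_{2p+5} - \ldots$ for the cosine series and the plain tail sums $\nu_{2p+1} = \mu_{2p+1} + \mu_{2p+3} + \ldots$ for the sine series, and it compares both results with direct summation. The function names and sample coefficients are hypothetical.

```python
import math

def cheb_values(x, count):
    """[1, C_1(x), C_2(x), ...]: the first entry is (1/2)C_0 = 1."""
    vals = [1.0, x]
    prev, cur = 2.0, x
    while len(vals) < count:
        prev, cur = cur, x * cur - prev
        vals.append(cur)
    return vals[:count]

def alternating_tail_sums(kappa):
    """kappa = [kappa_1, kappa_3, kappa_5, ...]; entry p of the result is
    kappa[p] - kappa[p+1] + kappa[p+2] - ..."""
    lam = list(kappa)
    for p in range(len(kappa) - 2, -1, -1):
        lam[p] -= lam[p + 1]
    return lam

def tail_sums(mu):
    """mu = [mu_1, mu_3, mu_5, ...]; entry p is mu[p] + mu[p+1] + ..."""
    nu = list(mu)
    for p in range(len(mu) - 2, -1, -1):
        nu[p] += nu[p + 1]
    return nu

def g_half_angle(kappa, x):
    """Sum of kappa_{2p+1} cos((p + 1/2) theta) at x = 2 cos(theta)."""
    c = cheb_values(x, len(kappa))
    lam = alternating_tail_sums(kappa)
    return 0.5 * math.sqrt(2.0 + x) * sum(l * ci for l, ci in zip(lam, c))

def h_half_angle(mu, x):
    """Sum of mu_{2p+1} sin((p + 1/2) theta) at x = 2 cos(theta)."""
    c = cheb_values(x, len(mu))
    nu = tail_sums(mu)
    return 0.5 * math.sqrt(2.0 - x) * sum(n * ci for n, ci in zip(nu, c))

kappa = [0.7, -0.3, 0.2, 0.05]     # sample kappa_1, kappa_3, kappa_5, kappa_7
mu = [0.4, 0.6, -0.25, 0.1]        # sample mu_1, mu_3, mu_5, mu_7
x = -0.56
theta = math.acos(x / 2.0)
g_direct = sum(k * math.cos((p + 0.5) * theta) for p, k in enumerate(kappa))
h_direct = sum(m * math.sin((p + 0.5) * theta) for p, m in enumerate(mu))
g_value = g_half_angle(kappa, x)
h_value = h_half_angle(mu, x)
```

As with the $\gamma$ of section 4.1, the tail sums are independent of $x$, so they need forming only once for any number of arguments.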


5. Application to Mathieu Functions 

5.1. As an application of these processes of Fourier Synthesis, consider the calculation
of Mathieu Functions. These satisfy the differential equation

$$\frac{d^2y}{dt^2} + (a - 2q\cos 2t)\,y = 0. \tag{5.11}$$

Periodic solutions exist for particular characteristic values of $a$, depending on $q$; these
solutions may be expressed in the form (see Ince, 1932, p. 356)

$$\begin{aligned}
ce_{2n}(t, q) &= \Sigma\,A_{2p}\cos 2pt \\
se_{2n+1}(t, q) &= \Sigma\,B_{2p+1}\sin(2p+1)t \\
ce_{2n+1}(t, q) &= \Sigma\,A_{2p+1}\cos(2p+1)t \\
se_{2n+2}(t, q) &= \Sigma\,B_{2p+2}\sin(2p+2)t
\end{aligned} \tag{5.12}$$

all summations being for $p = 0, 1, 2, \ldots \to \infty$; the A’s and B’s being constants depending
on $n$ and $q$.

If in (5.11) we substitute $\mu = \cos 2t$, the equation becomes

$$(1 - \mu^2)\frac{d^2y}{d\mu^2} - \mu\frac{dy}{d\mu} + \tfrac14(a - 2q\mu)\,y = 0. \tag{5.13}$$

This equation, and the rather similar one in which $X = \tfrac12(\mu + 1) = \cos^2 t$ is taken as independent
variable, are not uncommon as variants of Mathieu’s equation (see, for example, Stratton,
etc., 1941, p. 260 = 2). In terms of the variable $x = 2\mu = 2\cos 2t$, (5.13) may be re-written as

$$(4 - x^2)\frac{d^2y}{dx^2} - x\frac{dy}{dx} + \tfrac14(a - qx)\,y = 0. \tag{5.14}$$

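The expansions (5.12) may then be written in a form suitable for giving solutions of (5.14),
or of (5.13),

$$ce_{2n}(t, q) = \Sigma\,A_{2p}\cos p\theta = \tfrac12\,\Sigma\,A_{2p}\,C_p(x) \tag{5.151}$$

$$\begin{aligned}
se_{2n+1}(t, q) &= \Sigma\,B_{2p+1}\sin(p + \tfrac12)\theta = \Sigma\,B_{2p+1}\sin\tfrac12\theta\,\{S_p(x) + S_{p-1}(x)\} \\
&= \tfrac12\sqrt{2-x}\;\Sigma\,(B_{2p+1} + B_{2p+3} + B_{2p+5} + \ldots)\,C'_p(x)
\end{aligned} \tag{5.152}$$

$$\begin{aligned}
ce_{2n+1}(t, q) &= \Sigma\,A_{2p+1}\cos(p + \tfrac12)\theta = \Sigma\,A_{2p+1}\cos\tfrac12\theta\,\{S_p(x) - S_{p-1}(x)\} \\
&= \tfrac12\sqrt{2+x}\;\Sigma\,(A_{2p+1} - A_{2p+3} + A_{2p+5} - \ldots)\,C'_p(x)
\end{aligned} \tag{5.153}$$

$$\begin{aligned}
se_{2n+2}(t, q) &= \Sigma\,B_{2p+2}\sin(p + 1)\theta = \Sigma\,B_{2p+2}\sin\theta\,S_p(x) \\
&= \tfrac12\sqrt{4-x^2}\;\Sigma\,(B_{2p+2} + B_{2p+6} + B_{2p+10} + \ldots)\,C'_p(x)
\end{aligned} \tag{5.154}$$

where, in each case, $C'_p(x) = C_p(x)$, $p \ge 1$; $C'_0(x) = \tfrac12 C_0(x) = 1$.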




5.2. Numerical Illustrations. Application of (5.152) and (5.153) to the evaluation of
$se_3(\tfrac16\pi, 10)$ and $ce_3(\tfrac13\pi, 10)$. In examining these numerical examples it should be noted that
the coefficients $B'_{2p+1}$ and $A'_{2p+1}$ apply to all values of the argument, and not only to the
particular values of $x$, or $t$, chosen here; again, the individual items in the columns of products
need not be recorded if a calculating machine is available.

$se_3(\tfrac16\pi, 10) = \tfrac12\sqrt{2-x}\;\Sigma\,B'_{2p+1}\,C'_p(x)$, where $B'_{2p+1} = B_{2p+1} + B_{2p+3} + B_{2p+5} + \ldots$

$\theta = 2t = \tfrac13\pi$,   $x = 2\cos 2t = 1$,   $2\sin t = \sqrt{2-x} = 1$.

    p    B_{2p+1} *      B'_{2p+1}       C'_p(1)          Products
    0    +0.43239 50     +0.77165 62     (1/2)C_0 = +1    +0.77165 62
    1    +0.73446 92     +0.33926 12     C_1 = +1         +0.33926 12
    2    -0.50686 51     -0.39520 80     C_2 = -1         +0.39520 80
    3    +0.12790 76     +0.11165 71     C_3 = -2         -0.22331 42
    4    -0.01773 44     -0.01625 05     C_4 = -1         +0.01625 05
    5    +0.00157 79     +0.00148 39     C_5 = +1         +0.00148 39
    6    -0.00009 83     -0.00009 40     C_6 = +2         -0.00018 80
    7    +0.00000 45     +0.00000 43     C_7 = +1         +0.00000 43
    8    -0.00000 02     -0.00000 02     C_8 = -1         +0.00000 02
                                                          -----------
                                                          +1.30036 21

Thus $se_3(\tfrac16\pi, 10) = +0.65018\ 10+$. Ince, 1932, p. 398, gives +0.65018.

$ce_3(\tfrac13\pi, 10) = \tfrac12\sqrt{2+x}\;\Sigma\,A'_{2p+1}\,C'_p(x)$, where $A'_{2p+1} = A_{2p+1} - A_{2p+3} + A_{2p+5} - \ldots$

$\theta = 2t = \tfrac23\pi$,   $x = 2\cos 2t = -1$,   $2\cos t = \sqrt{2+x} = 1$.

    p    A_{2p+1} *      A'_{2p+1}       C'_p(-1)         Products
    0    +0.75526 89     -0.31466 20     (1/2)C_0 = +1    -0.31466 20
    1    +0.34008 13     +1.06993 09     C_1 = -1         -1.06993 09
    2    -0.53412 14     -0.72984 96     C_2 = -1         +0.72984 96
    3    +0.16718 53     +0.19572 82     C_3 = +2         +0.39145 64
    4    -0.02590 28     -0.02854 29     C_4 = -1         +0.02854 29
    5    +0.00247 06     +0.00264 01     C_5 = -1         -0.00264 01
    6    -0.00016 15     -0.00016 95     C_6 = +2         -0.00033 90
    7    +0.00000 77     +0.00000 80     C_7 = -1         -0.00000 80
    8    -0.00000 03     -0.00000 03     C_8 = -1         +0.00000 03
                                                          -----------
                                                          -0.23773 08

Thus $ce_3(\tfrac13\pi, 10) = -0.11886\ 54$. Ince, 1932, p. 403, gives -0.11887.
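Both illustrations can be re-run in a few lines. The sketch below takes the seven-decimal coefficients of $se_3$ and $ce_3$ for $q = 10$ as read from the two tables above; the first $B$ is taken as 0.4323950, which is a reading of a damaged figure in the scan, chosen so that the column of sums $B'$ is consistent. The function names are hypothetical, and the direct Fourier sums are included only as a cross-check.

```python
import math

# Coefficients B_1, B_3, ..., B_17 of se_3 and A_1, A_3, ..., A_17 of ce_3
# for q = 10, as read from the tables above (assumed readings of the scan).
B = [0.4323950, 0.7344692, -0.5068651, 0.1279076, -0.0177344,
     0.0015779, -0.0000983, 0.0000045, -0.0000002]
A = [0.7552689, 0.3400813, -0.5341214, 0.1671853, -0.0259028,
     0.0024706, -0.0001615, 0.0000077, -0.0000003]

def cheb_values(x, count):
    """[1, C_1(x), C_2(x), ...]: the first entry is (1/2)C_0 = 1."""
    vals = [1.0, x]
    prev, cur = 2.0, x
    while len(vals) < count:
        prev, cur = cur, x * cur - prev
        vals.append(cur)
    return vals[:count]

def se_odd(coeffs, t):
    """Sum of B_{2p+1} sin((2p+1)t) through Chebyshev values at x = 2 cos 2t."""
    x = 2.0 * math.cos(2.0 * t)
    sums = list(coeffs)
    for p in range(len(sums) - 2, -1, -1):
        sums[p] += sums[p + 1]            # B'_{2p+1} = B_{2p+1} + B'_{2p+3}
    c = cheb_values(x, len(sums))
    return 0.5 * math.sqrt(2.0 - x) * sum(s * ci for s, ci in zip(sums, c))

def ce_odd(coeffs, t):
    """Sum of A_{2p+1} cos((2p+1)t) through Chebyshev values at x = 2 cos 2t."""
    x = 2.0 * math.cos(2.0 * t)
    sums = list(coeffs)
    for p in range(len(sums) - 2, -1, -1):
        sums[p] -= sums[p + 1]            # A'_{2p+1} = A_{2p+1} - A'_{2p+3}
    c = cheb_values(x, len(sums))
    return 0.5 * math.sqrt(2.0 + x) * sum(s * ci for s, ci in zip(sums, c))

se3 = se_odd(B, math.pi / 6.0)
ce3 = ce_odd(A, math.pi / 3.0)
se3_direct = sum(b * math.sin((2 * p + 1) * math.pi / 6.0) for p, b in enumerate(B))
ce3_direct = sum(a * math.cos((2 * p + 1) * math.pi / 3.0) for p, a in enumerate(A))
```

The sketch is valid for $0 \le t \le \tfrac12\pi$, where $\sin t$ and $\cos t$ are both non-negative, so that the square roots carry the correct sign.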


REFERENCES TO LITERATURE 

Comrie, L. J., 1936. Interpolation and Allied Tables. Reprinted from Nautical Almanac for 1937,
and issued separately. Revised reprint, 1942. H.M. Stationery Office, London.

Ince, E. L., 1932. “Tables of the Elliptic Cylinder Functions,” Proc. Roy. Soc. Edin., LII, 355-423.

Lanczos, C., 1938. “Trigonometric Interpolation of Empirical and Analytical Functions,” Journ.
Math. Phys., XVII, 123-199.

Oppolzer, T. R. von, 1880. Lehrbuch zur Bahnbestimmung der Kometen und Planeten, Zweiter
Band, Engelmann, Leipzig.

Stratton, J. A., Morse, P. M., Chu, L. J., and Hutner, R. A., 1941. Elliptic Cylinder and
Spheroidal Wave Functions, including Tables of Separation Constants and Coefficients. (Pages
1-51 are reprinted from Journ. Math. Phys., XX, 259-309.) Wiley, New York; Chapman and
Hall, London.

Van der Pol, B., and Weijers, T. J., 1933. “Tchebycheff Polynomials and their Relation to
Circular Functions, Besselfunctions and Lissajous-Figures,” Physica, I, No. 1, 78-96.

Watson, G. N., 1922. A Treatise on the Theory of Bessel Functions, Cambridge University Press.

* The values in these columns are from Ince, 1932, pp. 369 and 370. 


(Issued Separately October 25, 1946) 





XXIII.— The Number of the Elements. By N. Feather, Ph.D., Cavendish Laboratory, 
Cambridge. Communicated by Sir Edmund Whittaker, F.R.S. (With Three 
Text-figures.) 


(Ritchie Lecture delivered March J, 1945) 

(MS. received July 3, 1945. Read November 5, 1945) 

It is natural—perhaps almost inevitable—that I should begin with a quotation from The 
Sceptical Chymist, for it is to Robert Boyle, more than to any one man, that the change from 
ancient to modern ideas in relation to my main topic is chiefly due. The Sceptical Chymist 
was published in 1661, and by that time the systems of the Greeks and the theories of the 
alchemists had alike long outlived any usefulness which, scientifically, either had ever possessed. 
Fire, earth, air and water—or mercury, sulphur, salt and earth (it matters little which list we 
choose) were inadequate as prototypes in a world in which the scientific spirit was already 
newly alive. So at least it seemed to Robert Boyle, for in the spacious language of his day 
he wrote : 

“ Notwithstanding the subtile reasonings I have met with in the books of the Peripatetiks, 
and the pretty experiments which have been shew’d me in the Laboratories of Chymists, I 
am of so diffident and dull a Nature as to think that if neither of them can bring more 
cogent arguments to evince the truth of their assertion than are wont to be brought; a 
Man may rationally enough retain some doubts concerning the very number of those 
materiall Ingredients of mixt bodies, which some would have us call Elements and others 
Principles.” 

Yet it was more than a hundred years later before a list of the chemical elements, in any way 
recognisable as the forerunner of the lists which we employ to-day, was drawn up on the basis 
of exact experiment. It was not that the rules which Boyle had laid down for deciding for or 
against the elementary nature of substances had been entirely disregarded by those who 
followed, but that the Newtonian concept of mass had not, during the period in question, 
been accepted and understood by the majority of scientists. So it was that the phlogiston 
theory of Stahl, which was put forward during the last years of the seventeenth century, was 
not wholly unscientific, and so it was that this theory formed the basis of speculation and 
experiment during the greater part of the century which followed. But Newtonian mass was 
slowly taking its place as the basic concept of physical theory, and very gradually the gravimetric
analyses of Lavoisier (1743-94), Richter (1762-1807) and others were forcing upon
the attention of chemists the prime importance of the notion of “combining weight” in the 
further development of their science. 

John Dalton—starting, it must be admitted, from purely physical considerations—translated
the idea of combining weight into that of “atomic weight”, and from that date (1805) the
reality of atoms was no longer merely a matter of philosophical speculation, but a theory was 
in existence, in which inconsistencies were trivial in relation to the range of success as a whole, 
whereby the relative masses of the various types of atoms were given as the result of experiment. 
From that date, too, the philosophical inquiry concerning the number of the elements came 
to be interpreted almost entirely along atomic lines. Whitehead (Science and the Modern
World, 1926) has said:

“In the eighteenth century every well-educated man read Lucretius, and entertained
ideas about atoms. But John Dalton made them efficient in the stream of science; and 
in this function of efficiency atomicity was a new idea.” 

I will not digress to enlarge upon this theme, but it is pertinent to wonder whether, in view of 
his own scant early education, John Dalton himself was ever well educated in this respect, 
whether he ever read Lucretius in the original text. 





I cannot afford, either, to digress on the more exact chemistry of Berzelius or Cannizzaro, or 
the theorising of Ampere and Avogadro, on the basis of which Dalton’s original atomic weights 
were amended and his ideas purified and extended in scope. I accept the chemical atom as 
a scientific reality, and pass on to the systematisers. Of them there are many; Dobereiner 
earliest of all (1817), then, when more exact data were available on which to build, Lothar 
Meyer (1860), de Chancourtois (1862), Newlands (1863) and Mendeleeff (1869). Not all
were hailed as prophets, but each had a contribution to make, and the last quarter of the 
century saw the Periodic Table of the Russian accepted as fundamental in any discussion 
of the number and interrelations of the elementary varieties of matter of which there was then 
actual knowledge. More than this, its intelligent use led to the discovery of new elements and 
to a revision of assigned atomic weights in other cases, but with its acceptance the contribution 
of classical chemistry to our subject was complete, and the problem of the number of the 
elements passed to the physicist. Before the periodic table could be further elucidated it was 
necessary that the nuclear model of the atom should emerge. This development did not occur 
until 1911. 

It is a commonplace to say that the nuclear model would never have emerged, with that 
claim to immediate acceptance which it possessed at the time of Rutherford’s formulation, but 
for the discoveries of X-rays and radioactivity in 1895 and 1896, and the isolation of the 
negative electron in the following year; nor is it cause for surprise that more than a dozen 
years should have elapsed between these momentous experimental discoveries and the full 
realisation of their implications in the form of a simplifying theory. Rather is it surprising 
that the advance should have progressed so far in so short a time; there was nothing very easy 
in the type of experiment required, and the incidence of outstanding genius cannot be 
guaranteed. Indeed, there were hints along the way which were missed—the work of Barkla 
(Barkla, 1906) and Kaye (Kaye, 1909) on the characteristic X-rays established a connection 
between position in Mendeleeff’s table and the exact physical measure of the penetrating
power of the radiation from any element which was very obviously fundamental, though no 
one knew exactly how to explain it—but when the unexpected occurred in the experiments of 
Geiger and Marsden (Geiger and Marsden, 1909) on α-particle scattering Rutherford was not
to miss the implication. It required nearly two years’ thought, but when the nuclear model 
was finally advanced (Rutherford, 1911) there was nothing for the physicist to do but to 
accept it. 

Acceptance by Bohr, as is well known, resulted in the first quantum theory of the atom as a 
whole (Bohr, 1913), acceptance by Moseley gave direction to his repetition of the work of 
Barkla and Kaye using an experimental technique of greater precision and power (Moseley, 
1913, 1914), acceptance of the nuclear model in the general field of radioactive research fixed 
upon the atom nucleus as the system involved in spontaneous radioactive disintegration, and 
the displacement rule of Russell, Fajans and Soddy placed the shorter-lived radioelements 
securely in the periodic table between uranium (atomic number 92) and thallium (atomic 
number 81). It was here that the first hint of a new possibility arose: it was frequently 
necessary to accommodate distinct radioactive types in the same cell of the table, and it was 
necessary, on the incontrovertible evidence of the chemist, to bracket, with the inactive 
elements thallium, lead and bismuth, active species of short lifetime. In the writings of 
Soddy at this period (1913-17) the idea of isotopy, which had its origin in these observations, 
was expanded and elaborated with great foresight and understanding (cf. Soddy, 1917). 
When the war of 1914-18 was over, Aston took up the matter on the basis of an earlier observation
of Thomson on the “parabolas” of neon (Thomson, 1913), and a more powerful instrument
was built involving a novel principle with which an astonishingly fruitful start was made
on the mass analysis of the elements in general (Aston, 1919). Two parameters, then, forced 
themselves on the attention of physicists for the specifying of a nucleus: its positive charge, 
expressed as a multiple (Z) of the charge on the electron, and its mass number (A), the nearest 
integer to the number expressing the mass of the corresponding neutral atom in terms of 
chemical oxygen as 16. Departures of exact masses from integral values on this scale proved 
to be small throughout the whole system of elements, and the elements themselves were 
defined and distinguished, of course, by the Z values, the “atomic numbers” of van den Broek
and Moseley.




Parallel with the purely experimental attack on the nuclear side, experiment and detailed 
theory were keeping pace in investigations on the electronic, or extra-nuclear side of the picture. 
With these developments I cannot deal; I can only mention the names of pioneers, whose 
work has long since been greatly extended and, in form at least, largely superseded, but to 
whom the credit belongs of first giving a coherent “explanation” of the periodic table in its
original exclusively chemical aspect. Bohr, Main-Smith and Stoner developed their ideas on 
this subject in the early nineteen-twenties, and it can now be said that as a result of their work, 
and that of those who have followed them, the general interpretation of the many regularities 
with which the table abounds is solidly based in an acceptable framework of physical theory. 

I return, then, to the periodic table as viewed from the standpoint of the nuclear physicist, 
and I would say that by 1932 the survey of “existing” atom types was fairly complete. Not 
only for the well-known elements, but also for such recently discovered substances as hafnium 
(Z = 72), discovered by Coster and Hevesy (1923), and rhenium (Z = 75), identified by
Noddack (1925), both by the method of Moseley, had satisfactory isotopic analyses been 
carried out. Though the survey of types was well-nigh complete, precision in the 
determination of exact masses still left much to be desired; in spite of this, however, a new 
accession of data was available for the systematiser to work upon. Let us see, therefore, what 
he made of it. First, there are stable species for all values of Z from 1 to 83, inclusive, with 
the exception of 43 and 61. Secondly, amongst the stable species all values of A are represented
except A = 5 and A = 8, in the range from 1 to 209, and many values of A are represented
twice. Third, when Z is odd A is odd (for Z > 7), and for these odd-numbered elements the
number of stable isotopes is never greater than two. Lastly, when A has the same value for 
two stable nuclei (isobars) it is much more likely that Z values differ by two units than by one 
unit: “neighbouring” isobars are very rare. Obviously, significant regularities are involved 
here, though we shall not be able to make much progress in their interpretation without 
hypotheses concerning the structure of the nucleus itself. To this I shall return presently; 
meanwhile let us take a last look at the extra-nuclear structure of the atom, to dispose of the 
suggestion that one clue to an understanding of the number of the elements is to be found 
there. This is the contention of those who have believed—or who have professed to believe— 
that we can understand in a general way why atomic numbers higher than that of uranium 
(Z = 92) are not represented on the earth, without any knowledge of nuclear properties at all. 

At bottom the argument of these theorists—Sommerfeld, Narliker and Flint (1932) have 
all made suggestions of the type I have in mind—is bound up with the non-dimensional 
character of the quantity $hc/2\pi e^{2}$, the number 137. For one reason or another they say that
Z cannot be greater than this number, or some specified fraction of it (Flint says not greater
than $137/\sqrt{2}$, i.e. not greater than 97). Of the reasons advanced the one most intelligible,
physically, is that for higher values of Z the electrons in the K shell would fail to satisfy certain 
fundamental conditions attaching to periodic motion, according to the particular unitary 
theory of gravitation and quantum phenomena favoured by the writer (Flint, 1932). Even 
though the truth of this contention were granted, it is difficult to see why such limitation should 
be decisive in respect of the types of nucleus or atom which may exist. At the worst it might 
be supposed to imply that the normal state of such hypothetical heavy atoms would be a 
doubly charged state, with the neutral atom having as transient an existence as, say, a doubly 
charged negative ion in a gas; or that its place in the periodic table would be two higher than 
would accord with its nuclear charge. But, truth to tell, I have little sympathy with these 
hypotheses in any form. 
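
The arithmetic behind the figures quoted in this paragraph can be checked in a line or two. The sketch below assumes Gaussian units for the electronic charge e, and takes Flint's fraction to be $137/\sqrt{2}$, a reading consistent with the limit of 97 quoted in the text.

```latex
\[
\frac{hc}{2\pi e^{2}} \;=\; \frac{\hbar c}{e^{2}} \;\approx\; 137,
\qquad
\frac{137}{\sqrt{2}} \;\approx\; 96.9 ,
\]
```

so that a bound of this form would place the last admissible element at Z = 96, with 97 as the first value excluded.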

I return, then, to theories of nuclear structure. These have taken on their present aspect 
since the discovery of the neutron (A = 1, Z = 0) by Chadwick in 1932 (Chadwick, 1932). As a
result of that discovery it became possible for the first time to think of nuclei as constituted 
entirely of heavy particles, neutrons and protons (A = 1, Z = 1), to the great simplification of the 
whole picture. By determining the “exact” masses of neutron and proton in the free state, 
and of the complex nuclei, with as high an accuracy as could be achieved, it became possible, 
too, to estimate, on the basis of this model, the mass defect ($\Delta M$) or energy of binding ($\Delta M \cdot c^{2}$)
for each nucleus. Fig. 1 shows the results of these determinations. It is an important fact 
that the mass defect of the neutron-proton model increases roughly linearly with the mass 
number—which is also the total number of heavy particles in the nucleus. Taken together
with the facts that the numbers of nuclear protons and neutrons are not widely different for
most stable nuclei (in all cases, except $^{3}$He, when there is an excess of one type of particle, it
is neutrons which are present in the greater number), and that amongst the lightest nuclei it
is the group $^{4}$He, $^{12}$C, $^{16}$O* which comprises the most tightly bound systems, this result
leads to the conclusion that intra-nuclear forces show saturation (as the chemists understand
the term), with the α-particle, $^{4}$He, as the quasi-saturated unit. With a structure of this type
we should further expect that the volume occupied by the nuclear particles, in any case, would
be fairly closely proportional to the number of particles involved, so that, if r is the effective
radius, $A/r^{3}$ should be approximately constant for all nuclei. There is independent evidence
for the general truth of this result.

Fig. 1.—Curve (a), neutron-proton model; curve (b), maximum number of α-sub-units assumed.
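
The mass-defect calculation described above can be illustrated numerically. The following is a minimal sketch only: the masses are modern tabulated values, assumed here for illustration and not the 1945 figures behind fig. 1, and the function names are hypothetical. It evaluates the binding energy on the neutron-proton model for the three tightly bound light nuclei named in the text.

```python
# Illustrative sketch: mass defect and binding energy on the neutron-proton model.
# All numerical masses are modern tabulated values (an assumption; they are not
# the values used in the lecture).

U_TO_MEV = 931.494       # energy equivalent of one atomic mass unit, in MeV
M_HYDROGEN = 1.007825    # mass of the neutral hydrogen atom, in u
M_NEUTRON = 1.008665     # mass of the free neutron, in u

ATOMIC_MASSES = {        # (Z, A) -> mass of the neutral atom, in u
    (2, 4): 4.002602,    # helium-4
    (6, 12): 12.000000,  # carbon-12
    (8, 16): 15.994915,  # oxygen-16
}


def mass_defect(z, a):
    """Mass defect in u: free constituents minus the bound atom."""
    n = a - z
    return z * M_HYDROGEN + n * M_NEUTRON - ATOMIC_MASSES[(z, a)]


def binding_energy_mev(z, a):
    """Energy of binding, i.e. the mass defect multiplied by c^2, in MeV."""
    return mass_defect(z, a) * U_TO_MEV


if __name__ == "__main__":
    for (z, a) in sorted(ATOMIC_MASSES):
        b = binding_energy_mev(z, a)
        print(f"Z={z:2d} A={a:3d}  B = {b:7.2f} MeV  B/A = {b / a:5.2f} MeV")
```

With these inputs the binding energy per particle stays between about 7 and 8 MeV for all three nuclei, which is the roughly linear growth of mass defect with mass number referred to in the text.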

* $^{8}$Be is not found in nature, but there is no serious problem raised by this fact. There is good reason to
believe that in the ground state this nucleus would be tightly bound in relation to neutrons and protons, but just
not stable in relation to division into two α-particles.

We have built up a picture, therefore, which has considerable resemblance to that of a
minute drop of liquid, except that the number of constituent particles is, relatively, very small.
If such a picture is valid, we should be able for many purposes to represent the intra-nuclear
forces with sufficient accuracy in terms of surface tension and intrinsic pressure, and, when
this is necessary, take count of the fact that the nucleus is a charged body by assuming a
uniform surface charge on the drop. A liquid-drop model of the nucleus of this type has been
much used since it was introduced by Gamow in 1929 (Gamow, 1929). We shall have recourse
to its aid at a later stage in our discussion; meanwhile let us look further at the facts concerning
those types of nucleus which are known to be stable.*

As I have already said, stable nuclei are not known for which Z = 43 and Z = 61; that is, no 
stable nucleus is known containing 43 or 61 protons in its structure. Similarly, a survey of 
existing types shows that the following values of (A-Z) are not represented, viz. 

19, 21, 35, 39, 45, 61, (65), (71), 89, 111, 115, 123 †

or nuclei containing these numbers of neutrons are unknown amongst the existing elements 
(our survey covers the range Z < 84, A < 210). In the first place all these “missing” numbers
are odd, which is in line with our conclusion regarding the tendency towards saturation of the 
forces (saturation being achieved with groups of two protons and two neutrons in every case), 
and in the second place “missing” (odd) numbers of neutrons are more in evidence than 
“missing” (odd) numbers of protons. This latter result appears to indicate a definite 
asymmetry, not connected with the purely electrostatic forces between the protons in the 
nucleus; in another aspect it is a reflection of the fact that stable nuclei (with Z > 7) are 
unknown having Z odd and A even, and that, when Z is even, odd values of A are less likely 
than even values. 
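
The parity statements of this paragraph lend themselves to a small check. The sketch below is illustrative only: the function names are hypothetical, the list of "missing" neutron numbers is as read from the text (doubtful cases included), and the sample nuclei are chosen simply as familiar examples.

```python
# Sketch of the empirical parity rules for stable nuclei stated in the text.
# Function names and sample nuclei are illustrative assumptions.

def neutron_number(z, a):
    """Number of neutrons in the nucleus, A - Z."""
    return a - z


def excluded_by_parity_rule(z, a):
    """True when Z is odd and A is even with Z > 7, the combination for
    which, according to the text, no stable nuclei are known."""
    return z > 7 and z % 2 == 1 and a % 2 == 0


# "Missing" neutron numbers as read from the list in the text
# (the bracketed, doubtful values 65 and 71 are included).
MISSING_NEUTRON_NUMBERS = [19, 21, 35, 39, 45, 61, 65, 71, 89, 111, 115, 123]

if __name__ == "__main__":
    samples = [("nitrogen-14", 7, 14), ("chlorine-35", 17, 35),
               ("chlorine-36", 17, 36), ("iron-56", 26, 56)]
    for name, z, a in samples:
        print(name, "N =", neutron_number(z, a),
              "excluded:", excluded_by_parity_rule(z, a))
    # Every missing neutron number is odd, as the text remarks.
    print(all(n % 2 == 1 for n in MISSING_NEUTRON_NUMBERS))
```

The rule is a necessary condition only: a nucleus which passes the test need not be stable, but one which fails it (for Z > 7) is not expected among the stable species.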

The survey which we have just made has served to establish certain empirical results 
which may be systematised and understood, after a fashion, but it does not answer the natural 
question, “What are the properties of the ‘missing’ nuclei, and in particular of those nuclei
of which the non-occurrence upon the earth interrupts the sequence of isotopes of the different
elements?” In the latter category are the “missing” even isotopes (i.e. even A) intermediate
in mass between the two odd isotopes which many of the elements of odd Z possess,
and many odd isotopes belonging to elements of even Z. The answer to this question has, 
within the last ten years, been completely given by experiment. Previously, it was thought 
sufficient to say that since elements with Z > 83 are all spontaneously radioactive, so with the 
lapse of time they must all disappear entirely from the earth; now we can add that we know, 
as the result of direct experiment, that the “missing” nuclei of smaller mass are, almost 
without exception, also radioactive, but with shorter periods than have uranium or thorium. 
Thus it is inevitable that these nuclei are “missing” on the earth to-day, whatever may have 
been the early history of our planet. Fig. 2 gives some idea of the growth—and present near 
completeness—of our knowledge in this respect, so far as concerns a small range of values of Z. 
It need only be added that information of like extent is available for all the elements; the 
discovery by Curie and Joliot in 1934 (Curie and Joliot, 1934) that an “artificially radioactive”
phosphorus could be obtained by bombarding aluminium with α-particles has rapidly been
followed by the discovery and study of some hundreds of similar radioelements formed by
irradiating ordinary materials with α-particles, neutrons, protons, deuterons, electrons and
high-energy X-radiation. Chemical and other tests have served to identify the new products 
as to A and Z, and a study of the energy of the radiations which they have been found to emit 
has given us an accurate index of the extent to which the previously “missing” nuclei are in 
fact unstable. Let us look at this last point in more detail, first remarking upon the nature 
of the disintegration processes involved. 

With the “classical” radioelements, the descendants of uranium and thorium, two types 
of radioactive change were early identified: disintegration with the emission of α-particles
(helium nuclei) on the one hand, and the emission of β-particles (negative electrons) on the
other. With the “artificially” produced radioelements α-particle emission has not so far
been discovered,‡ but β-emission of two types has been found. Broadly speaking, when species
are produced in which the number of nuclear neutrons is in excess of what is normal for a 
stable nucleus, negative electrons are emitted; when protons are in relative excess positive 
electrons are given out (or the nucleus captures one of the extra-nuclear electrons, in a process 

* It is not possible to be completely certain of the absolute stability of the heavier nuclei; as an empirical 
rule we can regard a nucleus as stable if the half-value period for any spontaneous disintegration to which it may 
be liable is greater than io u years. . . . 

† Brackets refer to doubtful cases, the doubt being whether both members of certain pairs of neighbouring
isobars are in fact stable according to our definition.

‡ Except with element 85 (v. infra).





which is generally referred to as K-electron capture *). In fig. 2 species of the former type 
are represented by open circles in general above and to the left, and those of the latter type 
by open circles generally below and to the right, of the straggled array of full circles which 
represent the stable nuclei. This statement is indefinite chiefly as concerns certain unstable 
species which are isobaric with two stable nuclei for which $\Delta Z = 2$ (non-neighbouring stable
isobars—v. sup.). When the nuclear charge of such an unstable nucleus is intermediate
between the nuclear charges of the stable isobars it would appear that either positive or negative
electron emission might occur. Sometimes such a radioelement is found to exhibit the
phenomenon of radioactive branching (positron-negatron branching) as this conclusion would
suggest, sometimes only one mode is evident—to the accuracy with which present investigations
have gone. (Examples of the former type are indicated in fig. 2 by oblique arrows pointing in
opposite directions, those of the latter by a single arrow, in one direction or the other.) Finally,
a survey of “artificially” produced radioelements has greatly increased the number of known
examples of nuclei having long-lived excited states (metastable states). A nucleus in such a
state is radioactively quite distinct from the same nucleus in its lowest energy (ground) state.
Before 1934 the only recognised instance of nuclear isomerism of this type was provided by the
pair of bodies uranium X$_{2}$ and uranium Z (for both of which A = 234, Z = 91); already, in the
range of Z covered by fig. 2, for example, the seven species, represented by the double open
circles in the diagram, have been found to show this effect.

Fig. 2. [Axes: A − Z (vertical) against Z (horizontal).]

* K-electron capture is favoured, in relation to positron emission, by there being available about
1 MeV more energy for the former process than for the latter in all cases, but not in every case, by any means,
is it the more probable mode of disintegration in actual fact.

Returning now to a survey of the amounts of energy liberated in the various disintegrations,
we are in a position to correlate this information as to the degree of instability of unstable
nuclei in general with what has already been said concerning the regularities exhibited by the
A and Z values of the stable species. If we take the whole of the information available concerning
negative-electron-active bodies (and parallel statements are almost certainly valid for
the positron-active bodies, also, though the experimental information is at present less complete
for them) we find that this clear regularity emerges. When a species having A and Z both
even gives rise by β-disintegration to a daughter product which is itself β-active, then the
disintegration energy of the first change (Z even → Z odd) is in general less than the disintegration
energy of the second (Z odd → Z even), whereas, when there are two consecutive
β-active bodies the former of which has A odd and Z even, then the reverse is the case and the
disintegration energy is greater for the first body than for the second. We can obviously 
extend this empirical rule by including cases in which the disintegration energy is negative 
and so introduce stable species into our considerations. Then our rule requires that, if we 
have neighbouring isobars of even mass number, one stable and the other unstable (negative- 
electron-active), it is the species of even charge number which must be stable (unless we admit 
of the occurrence of neighbouring stable isobars for this value of A). And, in the alternative 
case, the rule allows statements under two heads. First, if we have neighbouring isobars of 
odd mass number of which the species of smaller charge number is stable and that of greater 
charge number is negative-electron-active, then the charge number of the stable species is odd. 
Secondly, if for such neighbouring isobars the negative-electron-active species is the one of 
smaller Z and the stable one that of greater Z, then the stable species in question can either 
have Z even or odd. These three deductions from the extended rule obviously correspond 
exactly with our former statements that there are in general no stable species with Z odd and 
A even, whereas stable species with Z odd and A odd as also with Z even and A both even 
and odd may occur. 

We have come full circle, therefore, and linked up our attempts to systematise for the 
stable and unstable species for which Z < 84; and others, as for example Fuchs (Fuchs, 1939), 
have gone somewhat further, interpreting (or describing) the regularities to which I have 
drawn attention in more formal terms. I cannot burden you now with such detailed considerations;
however, for Z < 84, it is clear, I think, that the problem of the number of the
elements—or of the number of stable nuclear species—turns essentially on the question of
stability towards β-disintegration.* My last remarks, then, deal with the range Z > 83.

As I have already indicated, the customary assumption here is that there are no stable 
species of higher Z because of the onset of α-instability. Disintegration with the emission of
α-particles decreases Z by two units; thus the problem of the stability of the heaviest elements
has come to be regarded as primarily a question of the possibility or otherwise of α-emission.
But this “explanation” is not, and has never been, very satisfactory, chiefly because there is
a very clear general tendency for the energy of α-disintegration, that is for the degree of
instability, to decrease as Z increases amongst the classical radioelements. In fact, if this
were not so, radioactive series (of heavy elements) would not be found in the world. There
would be no gap of 8 or 10 units in Z between the stable end-product of α-transformation and
the “nearest” active species to possess a lifetime comparable with the age of the solid earth. 
No, it is clear that we must search for another explanation. This was provided by the discovery
of neutron-induced fission in 1939. In that year it was found by Meitner and Frisch
(Meitner and Frisch, 1939; Frisch, 1939)—and the radiochemical work of Hahn and Strassmann
supported these findings (Hahn and Strassmann, 1939)—that capture of a neutron by
a uranium (or thorium) nucleus could lead to the division of the system into two parts of
comparable mass and charge. These “fission fragments” were unstable nuclei of intermediate
mass, which after a succession of β-transformations became stable nuclei belonging to the
known isotopes of the mid-table elements. Evidence for a great many modes of division was
found, β-active fission products being discovered, in one type of fission or another, with almost
all values of Z from 35 (bromine) to 58 (cerium). Then, in the following year, Petrzhak and
Flerov (Petrzhak and Flerov, 1940) showed that, in the natural state, uranium undergoes fission
spontaneously: for every million atoms of ordinary uranium which disintegrate in the “normal”
way, with the emission of α-particles, roughly one divides into two fragments, according to one
or other mode of spontaneous fission—and in that sense a relatively large number of additional
radioactive series (series of successive β-active bodies) must be added to the three series of

* One species, it is true, in this range of Z is cz-active and of long lifetime ^ but no essential failure in 

our ideas is involved in this fact. Four other long-lived species ^Rb, I ^ I Bu and may be 

regarded as relics of “artificially active” bodies, of more usual type, left over from the primal mixture of 
elements of which the earth's substance was formed at an early epoch in cosmic evolution. 



“classical” radioelements which are found on the earth. Now the point about spontaneous
fission is just this: in relation to fission a uranium nucleus is already, energetically, in a very
favourable state—with anything up to 170 MeV of energy available for the process. But the
nucleus does not break up “instantaneously” into two charged parts just because energy is
available: as in α-disintegration it disintegrates according to the “laws of chance”, with the
probability of disintegration (reckoned per unit time) depending very markedly upon the
difference between the available energy and an energy characteristic of the original nucleus and
the fragments concerned. On a simple liquid-drop model (with constant density and “surface
tension”) this difference is likely to be a function of $Z^{2}/A$, making spontaneous fission increase
extremely rapidly in probability as $Z^{2}/A$ increases. No doubt the model is much too crude for
detailed calculation, but it is clear enough that for Z a few units greater than 92 spontaneous
fission must be extremely rapid—and eventually “instantaneous”, even for nuclei in the
ground state.* When that stage is reached it is unquestionably true to say that nuclei of
higher charge cannot possibly exist. It is only necessary to suppose, therefore, that for the
immediate range of Z just greater than 92 no long-lived α-bodies occur, to understand why
uranium is the last terrestrially known element of the table: beyond that range spontaneous
fission will take care of the rest.

Fig. 3. [Axes: $\log_{10}\lambda$ ($\lambda$ in sec.$^{-1}$) against E (MeV).]
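
The liquid-drop parameter $Z^{2}/A$ named above is easily tabulated. The short sketch below is illustrative only: the function name is hypothetical, and the third entry is a purely hypothetical heavy nucleus introduced to show the trend, not a species discussed in the lecture. No attempt is made to compute a fission probability, for which the text itself regards the model as too crude.

```python
# Sketch: the liquid-drop parameter Z^2/A for a few nuclei.
# The function name and the hypothetical heavy nucleus are assumptions
# introduced for illustration.

def z_squared_over_a(z, a):
    """Return Z^2/A, the quantity on which the tendency to spontaneous
    fission depends in the simple liquid-drop picture."""
    return z * z / a


if __name__ == "__main__":
    cases = [
        ("lead-208", 82, 208),
        ("uranium-238", 92, 238),
        ("hypothetical Z=100, A=260", 100, 260),
    ]
    for name, z, a in cases:
        print(f"{name:28s} Z^2/A = {z_squared_over_a(z, a):.2f}")
```

The parameter rises steadily with Z along the line of the heaviest nuclei, and it is on this rise that the argument for an upper limit to Z depends.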

* The “instantaneous” neutron-induced fission of uranium is, of course, fission of a compound nucleus in a
highly excited state.

Having discussed the ultimate limitation to the number of the elements in the true sense
(limitation as to Z), it is of interest to inquire finally into the limitations in respect of A in the
domain of the heavy radioelements. This can be done most simply by examining the way in
which α-disintegration energy E (or decay constant λ) varies with mass number for a given Z.
For this purpose the case of Z = 84 provides the most clear-cut information. In fig. 3 the
Geiger-Nuttall curve for the α-active isotopes of this particular charge number is plotted in
the form $\log_{10}\lambda$ against E. Three radioelements of the radium series (RaA, RaC' and RaF)
and two each of the thorium and actinium series (ThA, ThC', and AcA and AcC') are represented,
and consideration of the experimental data is simplified by the fact that so far as is
known the α-disintegration of each body takes place according to a single mode only (and
change of nuclear spin in the process is, in each case, most probably zero). The points lie on a
good smooth curve, but the important observation is that whereas, as A decreases from 218 to
212 (no information is available for A = 217 and A = 213) the disintegration energy (or
α-instability) progressively increases, the disintegration energy decreases again even more
rapidly as A decreases further through 211 to 210 (polonium).* The curve for Z = 83 shows
similar features: E increases as A decreases from 214 (RaC) through 212 (ThC) to 211 (AcC),
but 210 (RaE) is β-active and 209 (ordinary bismuth) is stable. The parallel would be even
closer if it were found that radium E is α-active in a rare mode (all the other α-active species
having this value of Z show α/β branching), but it is striking enough as it is.
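
The ordinate of a Geiger-Nuttall plot of the kind described follows directly from the half-value period. The sketch below is a minimal illustration: the function names are hypothetical, and the two half-lives are approximate modern values assumed for the purpose, not figures taken from the lecture.

```python
# Sketch: converting a half-value period into the quantity log10(lambda),
# lambda being the decay constant in reciprocal seconds.
# Half-lives below are approximate modern values (an assumption).
import math

SECONDS_PER_DAY = 86400.0
SECONDS_PER_MINUTE = 60.0


def decay_constant(half_life_seconds):
    """Decay constant in sec^-1, from lambda = ln 2 / T."""
    return math.log(2.0) / half_life_seconds


def log10_lambda(half_life_seconds):
    """Ordinate of the Geiger-Nuttall plot for a given half-life."""
    return math.log10(decay_constant(half_life_seconds))


if __name__ == "__main__":
    half_lives = {
        "RaF (A = 210)": 138.4 * SECONDS_PER_DAY,     # about 138 days
        "RaA (A = 218)": 3.05 * SECONDS_PER_MINUTE,   # about 3 minutes
    }
    for name, t in half_lives.items():
        print(f"{name}: log10(lambda) = {log10_lambda(t):.2f}")
```

Both values fall within the range of ordinates spanned by fig. 3, the longer-lived polonium lying near the bottom of the scale.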

Until recently no radioelements with Z = 85 or Z = 87 were known, but within the last few
years three α-active isotopes of the former element and one β-active species with the latter
charge number have been reported in the literature.† Information concerning the active
isotopes with Z = 85 is sufficient once more to show the feature we have been discussing, in
outline at least: the α-disintegration energy for (A = 211, Z = 85) is less than, and that for (A = 216, Z = 85) greater
than, the α-disintegration energy for (A = 218, Z = 85). For values of Z greater than 85, however, the

α-active species of which we have knowledge all belong to mass-sequences for which E increases
as A decreases (for Z constant), but it would seem reasonable to suppose that, in this range of Z
also, the class of α-active species is limited, in the direction of A increasing, by the fact that
negative-electron-emission becomes more probable than α-emission—and, in the direction of A
decreasing, by the rapid decrease of the probability of α-emission in the limit in favour of
K-electron capture or the emission of positrons. Outside certain limits of A, for a given Z,
the energy available for α-disintegration would thus appear to be too small for the process to
be an important disintegration mode—and within these limits the disintegration energy would
appear in each case to vary smoothly with A, but unsymmetrically, with E greatest for a value
of A near the lower limit of the α-range.

Summary 

The development of the idea of the chemical element is traced from its early beginnings, 
and the importance for this development of the Newtonian concept of invariable mass is 
emphasised. The emergence of the nuclear atom model is outlined, and the discovery of the 
complex (isotopic) nature of the majority of known chemical elements is described. Nuclear 
charge (Z) and mass (A) numbers are defined. Previously recognised regularities concerning 
mass and charge numbers of existing stable species are shown to have exact counterparts 
in regularities relating to the degree of instability (as measured by the energy of disintegration) 
of β-active species (“naturally” and “artificially” radioactive species). Naturally occurring 
α-active species are regarded as the analogues of the stable species for charge numbers greater 
than 83, and for charge numbers both greater and less than this value the limitation to the 
number of stable or quasi-stable isotopes of a given element (limitation of A values for a given Z) 
is established as essentially a question of nuclear stability as against β-emission (positive and 
negative electron emission). Finally, reasons are given for supposing that the number of 
possible chemical elements is limited (limitation to Z in the direction of Z increasing) by the 
susceptibility to spontaneous nuclear fission of species of sufficiently high nuclear charge. 

* Since the lecture was delivered I have discovered that the device of plotting separate Geiger-Nuttall 
curves for each value of Z has been adopted by Berthelot (Berthelot, 1942), and that the polonium anomaly 
discussed above has been remarked on by him. 

† $(^{218}_{85})$ and $(^{216}_{85})$ formed in rare modes of β-disintegration of RaA and ThA respectively, $(^{211}_{85})$ by an (α, 2n) 
reaction on $^{209}_{83}$Bi, and $(^{223}_{87})$ in a hitherto unnoticed α-mode with Ac $(^{227}_{89})$. 





REFERENCES TO LITERATURE 

For the earlier history see articles “Atom” (Neville, F. H.), “Element” (Ostwald, W.) and 
“Matter” (Thomson, J. J.), Encyclopedia Britannica (11th edition, 1910-11), II, 870, IX, 253 and 
XVII, 891. 

Aston, F. W., 1919. “A positive-ray spectrograph,” Phil. Mag., XXXVIII, 707. 

Barkla, C. G., 1906. “Secondary Röntgen radiation,” Phil. Mag., XI, 812. 

Berthelot, A., 1942. “Énergies et périodes des désintégrations α, I et II,” Journ. Physique, III, 17 
and 52. 

Bohr, N., 1913. “On the constitution of atoms and molecules, I, II and III,” Phil. Mag., XXVI, 1, 
476 and 857. 

Chadwick, J., 1932. “Possible existence of a neutron,” Nature, CXXIX, 312. 

Curie, I., and Joliot, F., 1934. “Un nouveau type de radioactivité,” Comptes rendus Acad. Sci., 
Paris, CXCVIII, 254. 

Flint, H. T., 1932. “The uncertainty principle in modern physics,” Nature, CXXIX, 746. 

Frisch, O. R., 1939. “Physical evidence for the division of heavy nuclei under neutron bombardment,” 
Nature, CXLIII, 276. 

Fuchs, K., 1939. “On the stability of nuclei against β-emission,” Proc. Camb. Phil. Soc., XXXV, 242. 

Gamow, G., 1929. “Discussion on the structure of atomic nuclei” (p. 386), Proc. Roy. Soc., A, 
CXXIII, 373. 

Geiger, H., and Marsden, E., 1909. “On a diffuse reflection of the α-particles,” Proc. Roy. Soc., 
A, LXXXII, 495. 

Hahn, O., and Strassmann, F., 1939. “Über den Nachweis und das Verhalten der bei der 
Bestrahlung des Urans mittels Neutronen entstehenden Erdalkalimetalle,” Naturwiss., XXVII, 11. 

Kaye, G. W. C., 1909. “The emission and transmission of Röntgen rays,” Phil. Trans. Roy. Soc., 
CCIX, 123. 

Meitner, L., and Frisch, O. R., 1939. “Disintegration of uranium by neutrons: a new type of 
nuclear reaction,” Nature, CXLIII, 239. 

Moseley, H. G. J., 1913. “The high frequency spectra of the elements, I,” Phil. Mag., XXVI, 1024. 

Moseley, H. G. J., 1914. “The high frequency spectra of the elements, II,” Phil. Mag., XXVII, 703. 

Petrzhak, I. S., and Flerov, G. N., 1940. “Spontaneous fission of uranium,” Journ. Physics, 
U.S.S.R., III, 275. 

Rutherford, E., 1911. “The scattering of α- and β-particles by matter and the structure of the 
atom,” Phil. Mag., XXI, 669. 

Soddy, F., 1917. “Complexity of the chemical elements,” Proc. Roy. Inst., XXII, 117. 

Thomson, J. J., 1913. “Rays of positive electricity” (Bakerian Lecture), Proc. Roy. Soc., 
A, LXXXIX, 1. 


(Issued separately October 25, 1946) 



Time-Scales in Relativity 


221 


XXIV.— Time-Scales in Relativity. By A. G. Walker, M.A., D.Sc., Department 
of Pure Mathematics, University of Liverpool. Communicated by Sir Edmund 
Whittaker, F.R.S. 

(MS. received May 7, 1945. Read November 5, 1945. Revised August 26, 1946) 

1. Introduction.—One of the most important of Milne’s discoveries is undoubtedly the 
significance of time-scale regraduations and, in particular, the relation between atomic 
t-time and gravitational τ-time. Although in Milne’s work t-time is more fundamental than 
τ-time, this relationship is not inevitable, as was shown in an axiomatic development of 
cosmology given recently by the author.† There the τ-scale was the more fundamental, 
and it was not found necessary to introduce Milne’s t-scale. The object of the present paper 
is to discuss this primitive τ-scale still further, and to show how the t-scale may be introduced 
by means of an axiom. The unpublished work mentioned above is not required for this 
purpose because the cosmological models considered here were described in earlier papers 
(Walker, 1937, 1940 b). We also examine the various constants, absolute and conventional, 
which are connected with the different scales of time and length, and with different models. 

We refer to “axioms” rather than “hypotheses” in this work in order to stress the fact 
that we are not examining the external world but are constructing model universes. An 
hypothesis enters later when we assume that one of these models approximates to the external 
world. This leads us to consider certain astronomical observables (red-shift, distance, 
distribution, etc.) and the theoretical relations between them. It is hoped that these relations, 
when compared with actual observations, may eventually test our cosmological theory and 
determine those world-constants which are at present unknown. 

2. Absolute τ*-time.—It has been established that clocks attached to fundamental 
particles can be so graduated that they are congruent in a well-defined sense and that the 
distance between any two particles is constant in time. This distance is by definition $\tfrac{1}{2}c(\tau_2 - \tau_1)$, 
where c is constant and $\tau_1$, $\tau_2$ are instants of emission and reception at one particle of a light 
signal reflected at the other. These clocks are constructed from ideal experiments, i.e. 
experiments which are definable although not practicable, and are determinate except for 
an arbitrary world-wide change of zero and unit ($\tau' = a\tau + b$). When the unit is changed 
($\tau' = a\tau$), all distances are affected proportionately. 

We also know that, for given τ-clocks and for a given value of c, each fundamental 
particle corresponds to a point of a 3-dimensional Riemannian space, $S_3$, in such a way that 
certain linear sets of particles correspond to geodesics, the distance between any two particles 
is given by the geodesic arc between the corresponding points, and angles are given correctly 
by the Riemannian formula. $S_3$ is a space of constant curvature, say $K_0$, and convenient 
co-ordinates r, θ, φ can be defined so that any given particle O is at r = 0 and such that the 
metric of $S_3$ is 

$$de^2 = \frac{dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2}{(1 + \tfrac{1}{4}K_0 r^2)^2}. \tag{1}$$

All light paths are null geodesics in the 4-space with metric 

$$d\sigma^2 = d\tau^2 - c^{-2}\,de^2. \tag{2}$$

Suppose now that a clock as described above is given, that the value of c is given, and that 
$K_0$ is the curvature of the appropriate $S_3$. Let particles O, P, Q be such that OP = OQ = l 
and angle POQ = θ for given l and θ, and define $\Delta = \lim_{\theta \to 0} PQ/\theta$ for fixed l. Then Δ as a function 
of l, say Δ = f(l), is determinate from ideal experiments, i.e. the form of f is observable. 

† To appear shortly in Proc. Roy. Soc. Edin. 




Calculating l and Δ in terms of r we find from (1) 

$$l = \int_0^r \frac{dr}{1 + \tfrac{1}{4}K_0 r^2}, \qquad \Delta = \frac{r}{1 + \tfrac{1}{4}K_0 r^2}.$$

Hence f is calculated to be 

$$\begin{aligned}
K_0 > 0, &\quad f(l) = \sin(l\sqrt{K_0})/\sqrt{K_0}; \\
K_0 = 0, &\quad f(l) = l; \\
K_0 < 0, &\quad f(l) = \sinh(l\sqrt{-K_0})/\sqrt{-K_0}.
\end{aligned} \tag{3}$$

Since f(l) is observable and is one of these alternatives, it follows at once that the value of 
$K_0$ can be deduced. Hence: 

For a given τ-clock and value of c, $K_0$ is determinate. 
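
The determination of $K_0$ from an observed pair (l, Δ) can be illustrated numerically. The following is a minimal sketch only: it integrates the radial part of the metric (1) to obtain l, evaluates Δ = r/(1 + ¼K_0 r²), checks that Δ = f(l) as in (3), and then recovers $K_0$ by bisection. The function names, the midpoint rule and the bracketing interval [−4, 4] are assumptions made for the illustration and are not part of the paper.

```python
import math

def f(l, K0):
    """Equation (3): Delta as a function of the geodesic distance l."""
    if K0 > 0:
        s = math.sqrt(K0)
        return math.sin(l * s) / s
    if K0 < 0:
        s = math.sqrt(-K0)
        return math.sinh(l * s) / s
    return l

def l_and_delta(r, K0, steps=20000):
    """Geodesic distance l (midpoint rule) and Delta at co-ordinate distance r."""
    h = r / steps
    l = sum(h / (1.0 + 0.25 * K0 * ((i + 0.5) * h) ** 2) for i in range(steps))
    delta = r / (1.0 + 0.25 * K0 * r * r)
    return l, delta

def infer_K0(l, delta, lo=-4.0, hi=4.0, iterations=100):
    """Recover K0 from one observed pair (l, Delta) by bisection.

    For fixed l, f(l, K0) decreases as K0 increases (while l*sqrt(K0) < pi),
    so the assumed bracket [lo, hi] contains at most one root.
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(l, mid) > delta:
            lo = mid   # f too large: the true curvature is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In this sketch a single pair (l, Δ) suffices because the sign and size of the departure of Δ from l fix the curvature; in the ideal experiment the whole function f would be observed.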

The cases $K_0 = 0$ and $K_0 \neq 0$ must now be treated separately, and from now until § 7 
we shall consider only $K_0 \neq 0$. 

We shall now prove that when $K_0 \neq 0$ there exists an absolute unit of τ-time. First 
choose c = 1. Then $K_0$ depends upon the unit of τ-time, and with a change of unit given 
by $\tau' = a\tau$, all distances including l and Δ are multiplied by a, so that from (3), $K_0$ is replaced 
by $K_0/a^2$. Choosing $a = |K_0|^{1/2}$, then the new value of $K_0$ is either 1 or −1, and the new 
unit of time which gives $K_0$ this value is unique. Hence we have: 

When $K_0 \neq 0$, there is a scale of τ-time which is unique except for arbitrary zero. 

With this scale, $K_0 = \pm 1$ when c = 1. 

This will be called the absolute scale and written τ*. The corresponding value of $K_0$ when 
c = 1 will be written k, so that k = 1 or −1. The value k = 0 will also be included later to 
cover the case $K_0 = 0$. 

Since τ* is an absolute scale of time, it follows that an absolute measure of distance is given 
by the formula $l^* = \tfrac{1}{2}(\tau_2^* - \tau_1^*)$, and that the appropriate 3-space as described above has 
the metric (1) with $K_0 = k$. These absolute measures of time and distance intervals can now 
be used in other ideal experiments. 

3. Conventional Scales.—A complete study of our models can now be carried out using 
the absolute scales of time and distance defined in the last section. When, however, we come 
to suppose that one of the models approximates to the external world, we must recognize 
that the units of these absolute scales are probably not the units of time and distance (seconds, 
centimetres, etc.) used in physics to describe this world. It is necessary, therefore, to introduce 
conventional units, which are supposed to agree with physical units at the present instant. 

Conventional units can be determined from ideal experiments in which we are supposed 
to possess a clock reading absolute τ*-time and another reading the time of physics, e.g. 
seconds or years, and also a scale for measuring absolute lengths and another for measuring 
the lengths of physics, e.g. centimetres or miles. Suppose we find that a unit interval of 
absolute time is $\tau_0$ units of conventional time, and that a unit of absolute length is $c\tau_0$ units 
of conventional length. Then $\tau_0$ and c are observables, and the absolute and conventional 
scales are connected by the transformations 

$$\tau = \tau_0\tau^*, \qquad l = c\tau_0 l^*. \tag{4}$$

In terms of light-signals, l* is defined by $\tfrac{1}{2}(\tau_2^* - \tau_1^*)$, so that from (4), $l = \tfrac{1}{2}c(\tau_2 - \tau_1)$. 
This gives the relation between conventional distances and time intervals. 

The above comparisons for the determination of $\tau_0$ and c are carried out at the present 
instant, but we can imagine similar experiments being carried out at any other instant. 
We cannot, however, say anything about their results unless we adopt some rules for the 
construction of conventional scales at all instants. Certain rules are physically obvious, 
such as those using a periodic atomic or gravitational phenomenon to define a unit of time, 
and a crystal lattice or the wave-length of a particular Fraunhofer line to define a unit of 
length. Some of these will be considered later in connection with atomic time-scales, but 
others are too complex to be considered in the present discussion. One rule of particular 
interest and simplicity gives what will be called the conventional τ-clock. This clock reads 
conventional time at the present instant and is constructed so that $\tau_0$ is constant. Also, 
conventional lengths at all instants are defined so that c is constant, having the value determined 
at the present instant. It then follows that the relations (4) hold at all instants. 
From the form taken by dynamical equations when these scales are used, it appears that the 
above definition of conventional τ-time is equivalent to that in which periodic gravitational 
phenomena are used to provide a clock. 

A fact that must be constantly remembered is that all experiments, actual or ideal, are 
carried out by ourselves, i.e. at one particular particle, O. The above τ-clock therefore 
belongs to O, and the conventional lengths are measured from O. We can now, however, 
define conventional τ-clocks and lengths for any other particle by means of the absolute 
τ*-clock for that particle and the relations 

$$\tau = \tau_0\tau^*, \qquad l = \tfrac{1}{2}c(\tau_2 - \tau_1),$$

where $\tau_0$ and c have the values determined at O. From this it follows that all conventional 
measures can be represented, as described in § 2, in the 3-space with metric (1), where 

$$K_0 = k/c^2\tau_0^2. \tag{5}$$

Constants such as $\tau_0$ and c, which are not essential to the model and appear only when 
the model is compared with the external world, will be called conventional constants. 

4. The Dimensional Axiom.—In τ-time, the corresponding distances between fundamental 
particles are constant, and each model has a static appearance. An axiom which 
is fruitful and gives sensible results is the dimensional axiom, which states that there is no 
absolute zero of τ*-time, and hence of τ-time. This implies that a model is static, at least 
statistically, and that any non-static phenomenon is repeated and distributed in such a way 
that the state of the whole is the same at all instants. 

This axiom will be applied to atomic clocks in §§ 5, 6. Another application will be 
made when we come to consider observations on distant extra-galactic nebulae, for some 
assumptions must be made about these nebulae over the long intervals of time which their 
light takes to reach us. Our assumption is that we observe what may be called a mean nebula, 
i.e. one which is in all ways the mean of all nebulae in its neighbourhood. From the Cosmological 
Principle, or its equivalent, it follows that one mean nebula is similar to another, 
and from the dimensional axiom we see that the characteristics (size, intrinsic brightness, 
etc.) of a mean nebula are constant in τ-time. 

5. t-time.—A t-scale of time is here defined as one in which the fundamental particles 
have uniform relative motion, distances being measured in the usual way in terms of light 
signals. The relation which must exist between a scale of this kind and a τ*-scale has been 
well established by Milne and Whitrow, and is of the form $t = a e^{\tau^*/\kappa}$, where a and κ are positive 
constants. Since there is no absolute zero of τ*-time, it follows that there is no absolute 
unit of t-time, and the value of a is not significant. The essence of the above transformation 
is therefore given by the relations 

$$\frac{dt}{t} = \frac{d\tau^*}{\kappa}; \qquad t = 0 \text{ when } \tau^* = -\infty, \quad t = \infty \text{ when } \tau^* = \infty. \tag{6}$$

The zero of t-time is fixed and may be called the instant of creation; this will be written $T_c$. 

For a given κ, a t-clock is determined except for changes of unit. It is accelerating in 
τ*-time and there is an instant, T, at which the τ*- and t-clocks are running at the same rate, 
i.e. dt = dτ*. From (6), T is given by t = κ, so that κ is the t-measure of the interval $T_cT$. 
When κ remains fixed but the unit of t-time is changed, then T varies in such a way that 
$T_cT$ is still κ in the new unit. This demonstrates that κ is a pure number. 

For each κ there is a unique conventional t-clock, this being by definition a t-clock which 
runs at the same rate as the conventional clock of physics at the present instant $T_0$. From 
(4) and (6) we see that the interval $T_cT_0$ as measured by this clock is 

$$t_0 = \kappa\tau_0, \tag{7}$$

this, then, being the “present age of the universe” in the conventional t-time given by κ. 





Since the zero of τ-time is at our disposal, it is sometimes convenient to take it to be at 
the present instant; the relation between the two conventional time-scales is then 

$$t = t_0 e^{\tau/t_0}. \tag{8}$$

Milne’s transformation from t to τ differs from this only in that he sets the τ-clock to read 
$t_0$ at the present instant. 

When the value of κ is changed to κ′, then there is a new conventional scale, say t′, and 
from (7) and (8) it can be seen that the relation between t and t′ is 

$$t'/t'_0 = (t/t_0)^{\beta}, \qquad \beta = \kappa/\kappa'.$$
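
The relations between the conventional scales can be checked numerically. The sketch below is illustrative only and rests on the relations t = t_0 e^{τ/t_0} and t_0 = κτ_0, with the zero of τ at the present instant; the function names are assumptions made for the example. It converts between τ and t, and regraduates a reading of the κ-clock into a reading of the κ′-clock by passing through the common τ-time of the event, which reproduces the power law t′/t′_0 = (t/t_0)^β with β = κ/κ′.

```python
import math

def t_from_tau(tau, t0):
    """Conventional t-time from conventional tau-time (tau = 0 at the present)."""
    return t0 * math.exp(tau / t0)

def tau_from_t(t, t0):
    """Inverse relation; t must be positive (t = 0 is the instant of creation)."""
    return t0 * math.log(t / t0)

def regraduate(t, kappa, kappa_new, tau0):
    """Reading t' of the kappa'-clock at the event where the kappa-clock reads t."""
    t0 = kappa * tau0          # present age in the kappa scale
    t0_new = kappa_new * tau0  # present age in the kappa' scale
    tau = tau_from_t(t, t0)    # tau-time of the event, common to both scales
    return t_from_tau(tau, t0_new)
```

For example, with κ = 2, κ′ = 5 and τ_0 = 1.5 the exponent is β = 0.4, and `regraduate(t, 2.0, 5.0, 1.5)` agrees with 7.5 (t/3)^0.4 for any positive t.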

To find a convenient geometrical representation to associate with t-time, consider (1) and 
(2) with τ*-time and c = 1, so that $K_0 = k$. Transforming from (6) and writing r = κρ, we see 
that the 4-space with metric dσ² is conformal to the space $S_4$ with metric ds², where 

$$ds^2 = dt^2 - t^2\,de^2, \qquad de^2 = \frac{d\rho^2 + \rho^2\,d\theta^2 + \rho^2\sin^2\theta\,d\phi^2}{(1 + \tfrac{1}{4}K\rho^2)^2}, \tag{9}$$

and 

$$K = k\kappa^2. \tag{10}$$

Since null geodesics correspond to null geodesics in conformal spaces, it follows that the light 
paths are null geodesics in $S_4$. 

This space $S_4$, and the space $S_3$ with metric de², have frequently been used by the writer 
as maps suitable to t-time, and their properties have been described elsewhere (see, for 
example, Walker, 1940 b). In previous work, K appeared as an arbitrary constant, distinguishing 
between different models, and this, we shall see, is still the case. We now find, 
however, that only the sign of K is significant in τ-time, and that the value of |K| depends 
directly upon the choice of t-scale. The fact that κ, and hence K, is unique for a particular 
model depends upon the atomic axiom, by means of which we can construct an ideal experiment 
for the determination of κ. 

6. The Atomic Axiom.—The interpretation of red-shift in light received from a distance 
is closely connected with the problem of time keeping, and the phenomenon is reproduced 
in Milne’s world-model by way of a hypothesis concerning atomic time. This, in general 
terms, states that for an oscillatory phenomenon of an atomic character, the periods are 
constant in some t-time. For example, the frequency of a recognizable Fraunhofer line is 
supposed constant in some t-time. A phenomenon of this kind gives rise to what Milne calls 
an atomic clock. 

For our purpose the corresponding axiom can be stated thus: 

Axiom.—At each fundamental particle there occur periodic phenomena, which shall 
be described as “atomic”, such that (a) the frequency of each is constant in some t-time, 
(b) all frequencies in some interval occur at the same (any) instant other than $T_c$, and (c) 
atomic clocks with the same frequency at any one instant keep the same t-time. 

It follows from (a) that a particular value of κ is associated with each atomic clock. To 
determine this value for a particular atomic clock, set the τ*-clock to read zero at the present 
instant $T_0$ and measure the frequency ν* of the given oscillation in τ*-time continuously 
for an interval of time. Then from (6) and (a) above, we get a functional relation of the form 

$$\nu^* = a e^{\tau^*/\kappa}, \tag{11}$$

and since this is observable, it follows that κ and a are determinate. If the experiment is 
repeated at another instant, then the zero but not the unit of τ* alters and from (11) we see 
that a would be affected but not κ. Hence, κ is a determinate pure constant for the given 
atomic clock, but a depends upon the instant at which the experiment is performed. 
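
The ideal experiment just described amounts to fitting an exponential law ν* = a e^{τ*/κ} to readings of frequency against τ*-time. A minimal sketch of such a fit follows, assuming noise-free readings and an ordinary least-squares line through the points (τ*, log ν*); the function name and the synthetic data in the usage note are hypothetical and serve only to illustrate that a change of zero alters a but leaves κ unchanged.

```python
import math

def fit_atomic_clock(taus, nus):
    """Least-squares fit of log(nu*) = log(a) + tau*/kappa; returns (a, kappa)."""
    n = len(taus)
    ys = [math.log(v) for v in nus]
    mean_x = sum(taus) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in taus)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(taus, ys))
    slope = sxy / sxx                      # slope of the line is 1/kappa
    a = math.exp(mean_y - slope * mean_x)  # intercept of the line is log(a)
    return a, 1.0 / slope
```

If the same readings are referred to a clock whose zero is set an interval θ later, so that every τ* is reduced by θ, the fit returns the same κ and the new value a e^{θ/κ} in place of a.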

The atomic axiom does not assert that all atomic clocks keep the same t-time, i.e. that they 
all give rise to the same value of κ. This important property does, however, follow from the 
dimensional axiom. To prove this, consider all atomic clocks at the present instant. Then 
from (11), a = ν* for each, and from (b) of the axiom we have clocks giving all values of 
a in some interval. Also, from (c), there is just one κ for each ν* and therefore each a, and 
there exists, therefore, an observable relation κ = f(a) at the present instant. 




The experiments leading to this relation can be repeated at other instants, but by the 
dimensional axiom the form of the function so obtained must be independent of the instant 
at which the experiment is performed. To obtain the form of f after an interval θ of τ*-time, 
we see from (11) that the new form of this equation gives a′ in place of a where $a' = a e^{\theta/\kappa}$. 
Hence from κ = f(a) we deduce that the new functional relation is given implicitly by 
$\kappa = f(a e^{-\theta/\kappa})$ (a now denoting the newly measured value), whence, from the dimensional axiom, 

$$f(a) = f\{a e^{-\theta/f(a)}\}$$

for all θ and all a in some interval. 

For given a and any number x > 0, choose θ = −f(a) log (x/a). Then we must have 
f(x) = f(a) for fixed a and variable x, and the function f is therefore a constant. Thus the 
same value of κ emerges from all the experiments, i.e. we have proved that: 

All atomic clocks keep the same t-time, and κ is a constant of the model. 

The latter part of this statement follows partly from what we have proved and partly from the 
Cosmological Principle or its equivalent. We now see from (10) that: 

The constant K is determinate for a particular model, and that models in which the 
respective values of K differ are essentially different. 

7. k = 0.—This case differs from the others in that there is no absolute τ*-scale and no 
constant $\tau_0$. There is still conventional τ-time and distance related by a conventional 
constant c, and the dimensional axiom still applies, denying the existence of an absolute 
zero of τ-time. The appropriate 3-space for geometrical representation is given by 
(1) with $K_0 = k = 0$. 

While t-time is still related exponentially to τ-time, there is not now a unique κ associated 
with a given t-clock. Conventional t-time exists as before and is related to conventional 
τ-time by (8), $t_0$ now being a constant having any positive value; a conventional t-scale 
exists for each $t_0$, which is the age of the universe in this scale. The argument which led to 
the constancy of κ in § 6 can be applied with conventional τ in place of τ* and $t_0$ in place of 
κ; the dimensional axiom now leads to the fact that $t_0$ has the same value for all atomic 
clocks which read conventional time at the present instant, $t_0$ thus being a determinate 
conventional constant as before. 

8. To sum up the results so far, we have proved the existence of an observable model 
constant K, which can take any value† including zero, and which discriminates between 
different models. When K ≠ 0, there is a τ*-clock which has a unique unit but an arbitrary 
zero, and a t-clock which has a unique zero but an arbitrary unit. The relation between 
these two clocks involves |K|, and the existence of the unique t-clock and the significance 
of |K| depend upon both the dimensional and the atomic axioms. When a model is compared 
with the external world, there arise conventional τ- and t-clocks and conventional constants 
$\tau_0$, c, $t_0$. When K = 0, there are no absolute clocks, but there are conventional clocks as 
before and conventional constants c, $t_0$. 

Comparing results with those given by Milne, it appears that the world-model constructed 
by Milne is similar to our model in which K = −1, i.e. k = −1 and κ = 1. This model is 
interesting mathematically because the space $S_4$ with metric (9) is flat when and only when 
K = −1, and co-ordinates can then be defined in relation to each particle such that the transformation 
from one particle to another is Lorentz. This property implies the uniqueness 
of the t-clock (except for unit), a fact which appears to form a major part of the arguments 
for a unique t-scale given by Milne and Whitrow. Such a derivation of the t-scale requires, 
however, the prior establishment or adoption of the Lorentz transformations (or of K = −1 
in our notation), and it is not clear, at least to the author, what physical argument or assumption 
produces these transformations.‡ 

† The inclusion of the case K > 0 (i.e. k = 1) depends upon the form of the early light-signal axioms. In 
this case, a light signal emitted at one particle is received infinitely often at a second particle, contrary to an 
axiom which is regarded as desirable by Milne and by the writer. 

‡ The existence of a unique t-scale, i.e. the exclusion of regraduations of the form $t' = t^a$, a ≠ 1, is fundamental 
in Milne’s work, but it is difficult to trace its precise derivation. It is not given in Milne and Whitrow, 
1938, where only the general τ-scales, and hence t-scales, are derived, and although it emerges from the 
derivation of metric in Milne, 1940, there is, on p. 68, a supposition equivalent, in the author’s opinion, to the 
assumption K = −1. In other arguments, the Lorentz transformations are assumed outright. 




It is interesting to note that Milne regards t-time as more fundamental than τ-time, and 
refers to τ-time as “temporary . . . with its ephemeral constant $t_0$” (Milne, 1943, p. 17). 
This differs from the present view, for we have shown that τ-time is definable before t-time, 
that its definition does not require $t_0$, and that the uniqueness of t-time requires axioms which 
are not involved in the definition of τ-time. Also, $t_0$ is seen to be on the same logical footing 
as c, both being observable conventional constants. 

9. Actual Observables.—Assuming now that one of our models approximates to the 
external world, a problem of obvious importance is the determination of the constants k, κ, 
$\tau_0$, $t_0$ which have arisen, i.e. of K, c, and $t_0$ since knowledge of these is sufficient. It is now 
necessary to construct actual instead of ideal experiments; one, for the determination of c, 
is already well known, so we may assume that c is known and that there remain K and $t_0$ to be 
determined. For this purpose we can use such material as observations of spectral red-shift, 
apparent size, apparent brightness, distribution, and orientation of extra-galactic nebulae, 
and observation of sky brightness. There are also various determinations of the age of the 
universe, which therefore give $t_0$ provided atomic phenomena are used; these will not be 
discussed here, but an account of this work has been given recently (Bok, 1946). With 
regard to observables connected with extra-galactic nebulae, we can calculate theoretical 
expressions for them and deduce correlations which can be compared with the correlations 
actually observed. The data necessary to give K and $t_0$ are not yet available, but it can 
reasonably be expected that before long observations will not only give these constants but 
will also provide tests of the various relativistic theories of cosmology. 

Calculations of this kind, particularly in relation to the Lemaître models of general 
relativity, have frequently been studied by the author and by other writers (McCrea, 1935; 
Milne, 1935; Walker, 1934, 1940 a), and it will not, therefore, be necessary to give full details 
in the present paper. Our chief concern will be to find the degree of accuracy necessary 
to produce K and $t_0$, and to point out the agreement between calculations based on τ-time 
and those based on t-time, an agreement which is clearly to be expected since the two systems 
of time-keeping are logically consistent. The use of a particular time-scale depends upon 
whether nebulae are regarded as being at relative rest or in relative uniform motion, the former 
being given by the τ-scale and the latter by the t-scale. It is thus not sensible to say that one 
of these motions is the “true” motion, and this is confirmed by the agreement mentioned 
above. Other writers have not taken this view and have given theoretical correlations which 
appear to make a true motion determinate.† In our opinion this is due to an inadequate 
description of the supposed mean nebula; this point will be mentioned again in §§ 11, 12. 

10. Red-shift.—For the purpose of correlating observables we shall express them in 
terms of the co-ordinate distance r from O in the space $S_3$ associated with conventional τ-time. 
From (5) and (7) the metric of $S_3$ is (1) with $K_0 = K/c^2t_0^2$, and this holds in all cases. The 
geodesic distance from O is 

$$l = \int_0^r \frac{dr}{1 + Kr^2/4c^2t_0^2}. \tag{12}$$

In conventional τ-time with zero at the present instant $T_0$, light which reaches us at $T_0$ 
left a nebula at distance l at an instant T given by τ = −l/c. The frequency at T of a Fraunhofer 
line which has present frequency $\nu_0$ was therefore $\nu = \nu_0 e^{-l/ct_0}$, and the Doppler ratio 
$D = \nu_0/\nu$ for this line is, with l given by (12), 

$$D = e^{l/ct_0}. \tag{13}$$

Thus D > 1 and there is a red-shift, of amount depending upon l. 

The proportionate shift at wave-length λ is δλ/λ = D − 1, and this, we see, is independent 
of λ. Observations agree with this independence and therefore with the constancy of κ for 
all frequencies, deduced by us from the dimensional axiom. 

The calculation of D can also be carried out in t-time and gives the familiar result 

$$D = t_0/t, \tag{13'}$$


† See particularly Hubble and Tolman, 1935. 




where t is the instant at which the light was emitted. This agrees with (13) because of (8) 
with τ = −l/c. 
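
The agreement between the two calculations of the Doppler ratio can be verified numerically. The sketch below is an illustration under stated assumptions: it takes the geodesic distance to be l = ∫ dr/(1 + Kr²/4c²t_0²), evaluated in closed form, computes D = e^{l/ct_0} in τ-time, and compares it with D = t_0/t, where t = t_0 e^{τ/t_0} and τ = −l/c. The function names and the sample values are hypothetical.

```python
import math

def geodesic_distance(r, K, c, t0):
    """Closed form of l = integral_0^r dr / (1 + x r^2), x = K / (4 c^2 t0^2)."""
    x = K / (4.0 * c * c * t0 * t0)
    if x > 0:
        return math.atan(r * math.sqrt(x)) / math.sqrt(x)
    if x < 0:
        return math.atanh(r * math.sqrt(-x)) / math.sqrt(-x)
    return r

def doppler_tau(r, K, c, t0):
    """Doppler ratio computed in tau-time: D = exp(l / (c t0))."""
    return math.exp(geodesic_distance(r, K, c, t0) / (c * t0))

def doppler_t(r, K, c, t0):
    """Doppler ratio computed in t-time: D = t0 / t at the instant of emission."""
    tau = -geodesic_distance(r, K, c, t0) / c  # emission instant in tau-time
    t = t0 * math.exp(tau / t0)                # the same instant in t-time
    return t0 / t
```

Both functions return the same value for any admissible r, which is the numerical counterpart of the statement that the two time-scales are logically consistent.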

11. Distance from Apparent Size.—This observable is 

$$\Delta = (A/\Omega)^{1/2},$$

where Ω is the solid angle subtended by the distant nebula at the observer, and A is the actual 
normal area of the nebula. In practice, A is estimated from near nebulae, a fact which must 
be taken into account in the theoretical calculation. 

Previous calculations of Δ in a Lemaître model (Walker, 1934, § 7) can be applied to both 
τ-time and t-time. They were carried out in space-time with metric (after transformation) 
$R^2(d\tau^2 - de^2/c^2)$, where de is given by (1) with $K_0 = K/c^2t_0^2$, and R is a function of τ; we now 
have R = 1 for τ-time and $R = t/t_0$ (and $d\tau/t_0 = dt/t$) for t-time. In the Lemaître model, 
distance from apparent size is $Rr(1 + Kr^2/4c^2t_0^2)^{-1}$, the area A being assumed measured at 
the distant nebula in the appropriate local scale of length. Applying this result to τ-time, 
A for a mean nebula is constant by the dimensional axiom, and can therefore be measured 
at any instant. Putting R = 1, therefore, we get 

$$\Delta = \frac{r}{1 + Kr^2/4c^2t_0^2} = r - \frac{Kr^3}{4c^2t_0^2} + O(r^5). \tag{14}$$

In t-time, $R = t/t_0 = 1/D$, and the Lemaître distance is therefore the above Δ multiplied 
by $D^{-1}$. Now, however, lengths which were constant in τ-time are increasing with t, and 
A as measured at the nebula is proportional to t². The observer’s estimate of A from nebulae 
in his vicinity is therefore $(t_0/t)^2 = D^2$ times the area of the distant nebula, and since the A 
in $\Delta = (A/\Omega)^{1/2}$ is the observer’s estimate, we see that the Lemaître distance must be multiplied 
by D. Thus Δ is given by (14) as before. 

12. Distance from Apparent Brightness.—This observable, Δ′, is obtained by comparing 
the apparent with the absolute brightness of a nebula. The observed (apparent) brightness 
is corrected for what Hubble calls the “energy effects”, due to the red-shift but independent 
of any particular theory as to its cause. The absolute brightness is constant in τ-time by 
the dimensional axiom, and is in practice estimated from nebulae in the observer’s vicinity. 
In τ-time the calculation giving the theoretical formula for Δ′ is straightforward and similar 
to that for Δ; it is easily verified that 

$$\Delta' = \Delta. \tag{15}$$

In t-time the absolute brightness, being a time-rate of emission of energy, decreases as 
t increases and is proportional to 1/t. Local estimates must therefore be increased by the 
factor $t_0/t = D$ to give the true brightness at the instant of emission. But the observed apparent 
brightness must also be increased by the factor D to allow for the “dimming factor”, i.e. 
the diminution in the rate at which photons are received due to the recession of the source. 
These two corrections therefore cancel in the calculation of Δ′, and we finally get the same 
formula as in τ-time. A more detailed discussion of this problem has been given elsewhere 
(Walker, 1946). 

13. Number Counts. Mean nebulae are supposed distributed according to the Cosmological
Principle, and the corresponding points in $S_3$ are therefore distributed uniformly.
If n is the number per unit proper volume, then n is constant in time by the dimensional
axiom, and if $N(r)$ is the number of nebulae within co-ordinate distance r of O, then from the
metric of $S_3$ we find

$$N(r) = 4\pi n \int_0^r \frac{r^2\,dr}{(1 + Kr^2/4c^2t_0^2)^3} = \tfrac{4}{3}\pi n r^3\left[1 - \dots\right] + O(r^7). \qquad (16)$$


14. Sky Brightness. From the definitions of $\Delta'$ and n, the rate at which energy is received
on unit normal area from all nebulae within a small volume dV of $S_3$ is $D^{-1}\Delta'^{-2}In\,dV$, where
I and n are constants and $\Delta'$ is the distance of dV. The factor $D^{-1}$ is included to allow for
the diminution of energy due to red-shift, and I is the rate at which a mean nebula radiates
energy in unit solid angle. It follows from (15) and (14) that the energy received from all
nebulae within co-ordinate distance r and small solid angle $d\Omega$ is $L(r)\,d\Omega$, where

$$L(r) = In\int_0^r \frac{D^{-1}\Delta^{-2}r^2\,dr}{(1 + Kr^2/4c^2t_0^2)^3} = In\int_0^r \frac{D^{-1}\,dr}{1 + Kr^2/4c^2t_0^2}.$$

From (12) and (13), $(1 + Kr^2/4c^2t_0^2)^{-1} = ct_0 D^{-1}\,dD/dr$, and we find

$$L(r) = ct_0 In\,(1 - D^{-1}). \qquad (17)$$

Since $D > 1$ for all r, we see at once that the total sky brightness is finite when $K < 0$,
and also when $K > 0$ provided it is assumed in this case that each nebula is seen only once,
the alternative being infinite sky brightness. When $K < 0$, the limiting value of L is $ct_0 In$.

15. Orientation .—It has been shown by the author (1940 a) that observations of nebular 
orientation may give useful information. Two observable angles $\Theta$, $\Phi$ and the galactic
latitude A are involved, and the theoretical results are


K>o, ©= ±—+0(?' 2 ), 4> = o; 

ct 0 

K— o, @=o, <£ = 0 ; 

K< o, 0=o, 0 = -yCOsA.V-K + 0(^ 2 ). 


(iS) 


16. Correlations. Observables D, $\Delta$, $\Delta'$, N, $\Theta$, $\Phi$ have been expressed in terms of r,
and several correlations are obtained by the elimination of r. These are observable relations,
which can provide tests for our theory and enable us to determine K and $t_0$. That these
constants are determinate is corroborated by the fact that they would emerge from a complete
knowledge of, for example, the D-$\Delta$ correlation.

The correlations will not be considered in detail here, but some conclusions can be drawn.
We see from (13), (14), (15), and (16) that first and second approximations to the relations
between D, $\Delta$, $\Delta'$, and N give only $t_0$. Third order approximations are necessary before
these relations will yield K, even in sign, and it will be a long time before observations are
sufficiently accurate for this purpose. It has been suggested (Hubble and Tolman, 1935)
that the D-N correlation might be a way of getting K, at least in sign, and this appears at
first sight to be confirmed by the form of (16). We find, however, from (12), (13), and (16),
writing $\delta = D - 1$, that

N = const . . .

This indicates how good the observations of $\delta$ and N would have to be to give information
about K, and supports my conclusion.

The position is improved when observables $\Theta$ and $\Phi$ are admitted, for if the orientation
theory leading to (18) is confirmed, then correlations between $\Theta$, $\Phi$, D and $\Delta$ (or $\Delta'$) to the
first order give both $t_0$ and K. This appears to be the most promising method of obtaining
the desired information.


REFERENCES TO LITERATURE

Bok, B. J., 1946. M.N.R.A.S., cvi, 61.
Hubble, E., and Tolman, R. C., 1935. Astrophys. Journ., lxxxii, 302.
McCrea, W. H., 1935. Zeits. f. Astrophys., ix, 290.
Milne, E. A., 1935. Relativity, Gravitation and World-Structure, Oxford.
-, 1940. Journ. Lond. Math. Soc., xv, 4
-, 1943. Proc. Roy. Soc. Edin., A, lxii, 10.
Milne, E. A., and Whitrow, G. J., 1938. Zeits. f. Astrophys., xv, 263.
Walker, A. G., 1934. M.N.R.A.S., xciv, 159.
-, 1937. Proc. Lond. Math. Soc., xlii, 90.
-, 1940 a. M.N.R.A.S., c, 623.
-, 1940 b. Proc. Lond. Math. Soc., xlvi, 113.
-, 1943. Proc. Lond. Math. Soc., xlviii, 161.
-, 1946. Observatory, lxvi, 285.


(Issued separately October 25, 1946)





XXV. Some Continuant Determinants arising in Physics and Chemistry. By
D. E. Rutherford, M.A., B.Sc., D.Math., United College, University of
St Andrews. (With Four Text-figures.)

(MS. received May 10, 1945. Read November 5, 1945) 

1. Introduction 

In many problems in physics and chemistry certain determinants of large order and of a 
special type require to be evaluated. The determinants in question usually arise from some 
kind of secular equation, and in most cases they owe their origin to a system of particles of 
one kind or another which, in their equilibrium positions, form a regular lattice, each particle 
being acted upon by its nearest neighbours and perhaps by a fixed boundary. These forces 
may be assumed to be elastic in character, to the degree of approximation required. For 
simplicity we shall refer to these particles as atoms although they may be electrons, molecules 
or even material particles in the Newtonian sense, according to the nature of the problem 
under investigation. In particular we may mention the occurrence of these determinants 
in problems involving the solution of Schrödinger's Wave Equation for permitted energy
levels and in determining the distribution of electric charge in crystals, metals and large 
molecules. They have also been used by Born in his investigations of crystal structure by 
means of X-rays. It is hoped therefore that the results achieved in this paper will be of 
interest to the physicist and chemist as well as to the mathematician. 

In § 2 some types arising from one-dimensional problems are considered, while with the 
aid of the mathematical analysis of § 3 certain two-dimensional types are dealt with in § 4. 
The hexagonal chains of common occurrence in organic chemistry give rise to determinants 
which are special cases of those evaluated in § 4. Certain three-dimensional types may be 
treated in the same way as the corresponding two-dimensional types, but others have so far 
proved intractable. 


2. One-dimensional Types 

Of the determinants which we are about to consider we shall distinguish between what we 
may call the fixed boundary type and the free boundary type. These types of determinants 
may arise in different ways, but we give them these names because they apply appropriately 
to simple mechanical models of lattices. These models consist of n equal material particles 
which are constrained to move in a straight line and which are linked together by equal 
elastic springs as shown in the following diagrams:— 

Fig. 1. Fixed boundary.

Fig. 2. Free boundary.


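The secular determinants which give the normal periods of vibration for the two cases
illustrated above take the forms:

$$\begin{vmatrix} x & 1 & & & \\ 1 & x & 1 & & \\ & 1 & x & 1 & \\ & & \ddots & \ddots & \ddots \\ & & & 1 & x \end{vmatrix} \quad \text{(Fixed boundary)}, \qquad \begin{vmatrix} (x+1) & 1 & & & \\ 1 & x & 1 & & \\ & \ddots & \ddots & \ddots & \\ & & 1 & x & 1 \\ & & & 1 & (x+1) \end{vmatrix} \quad \text{(Free boundary)}. \qquad (2.1)$$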


It will be observed that the free boundary type is distinguished from the fixed boundary
type by the presence of additional elements in the top-left and bottom-right corners. In the
simple models described above, the n atoms are supposed to be identical, but we shall also
consider the case where the atoms are of two kinds arranged alternately in the model. All
the cases so far mentioned give rise to matrices of the form $S_m(u, v, a, b)$, where we define

$$S_{2n}(u, v, a, b) = \begin{bmatrix} (u+b) & 1 & & & & \\ 1 & v & 1 & & & \\ & 1 & u & 1 & & \\ & & \ddots & \ddots & \ddots & \\ & & & 1 & u & 1 \\ & & & & 1 & (v+a) \end{bmatrix}_{(2n)}, \qquad S_{2n+1}(u, v, a, b) = \begin{bmatrix} (u+b) & 1 & & & & \\ 1 & v & 1 & & & \\ & 1 & u & 1 & & \\ & & \ddots & \ddots & \ddots & \\ & & & 1 & v & 1 \\ & & & & 1 & (u+a) \end{bmatrix}_{(2n+1)}.$$

We shall also use the further abbreviations

$$R_m(x, a, b) = S_m(x, x, a, b), \qquad P_m(x) = R_m(x, 0, 0).$$

With this notation it is clear that the determinants (2.1) take the forms

$$|P_n(x)|, \qquad |R_n(x, 1, 1)|$$

respectively.

It was shown by Wolstenholme (Muir, p. 401) that if we write

$$x = 2\cos\theta,$$

then

$$|P_m(x)| = \frac{\sin (m+1)\theta}{\sin\theta}. \qquad (2.2)$$

Using the addition theorem for determinants we deduce that

$$|R_m(x, a, b)| = |P_m(x)| + (a+b)\,|P_{m-1}(x)| + ab\,|P_{m-2}(x)| = \frac{\sin (m+1)\theta + (a+b)\sin m\theta + ab\sin (m-1)\theta}{\sin\theta}. \qquad (2.3)$$

With the aid of elementary trigonometry we can derive the following particular cases of the
last formula:

$$|R_m(x, 1, 1)| = \frac{2\sin m\theta\,(1 + \cos\theta)}{\sin\theta}, \qquad |R_m(x, 1, 0)| = \frac{\sin (2m+1)\theta/2}{\sin \theta/2}.$$

It is clear from (2.2) that $|P_m(x)|$ vanishes when and only when

$$\theta = \frac{k\pi}{m+1}, \qquad [k = 1, \dots, m],$$

that is, when

$$x = 2\cos\frac{k\pi}{m+1}, \qquad [k = 1, \dots, m].$$

These are in fact the m distinct roots of the equation $|P_m(x)| = 0$. Since $|P_m(x)|$ is a
polynomial in x of degree m and since the coefficient of $x^m$ in this polynomial is +1, we
conclude that

$$|P_m(x)| = \prod_{k=1}^{m}\left(x - 2\cos\frac{k\pi}{m+1}\right). \qquad (2.4)$$

Similarly, it may be shown that

$$|R_m(x, 1, 1)| = \prod_{k=1}^{m}\left(x - 2\cos\frac{k\pi}{m}\right), \qquad (2.5)$$

and that

$$|R_m(x, 1, 0)| = \prod_{k=1}^{m}\left(x - 2\cos\frac{2k\pi}{2m+1}\right).$$
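The closed forms (2.3), (2.4) and (2.5) are easy to check numerically. The following is only a sketch, not part of the paper: it assumes the convention of this section (x on the diagonal, unit off-diagonal elements, b added to the top-left and a to the bottom-right element), and the function names `continuant`, `det_R`, `formula_2_3` and the three `product_*` helpers are illustrative choices.

```python
import math

def continuant(diag):
    """Determinant of a tridiagonal matrix with this diagonal and 1s beside it."""
    prev, cur = 1.0, diag[0]
    for d in diag[1:]:
        # Three-term recurrence: D_k = d_k * D_{k-1} - D_{k-2}
        prev, cur = cur, d * cur - prev
    return cur

def det_R(m, x, a, b):
    """|R_m(x, a, b)|; |P_m(x)| is the special case a = b = 0."""
    diag = [float(x)] * m
    diag[0] += b      # top-left corner carries b
    diag[-1] += a     # bottom-right corner carries a
    return continuant(diag)

def formula_2_3(m, theta, a, b):
    """Right-hand side of (2.3); requires sin(theta) != 0."""
    num = (math.sin((m + 1) * theta)
           + (a + b) * math.sin(m * theta)
           + a * b * math.sin((m - 1) * theta))
    return num / math.sin(theta)

def product_P(m, x):
    """Right-hand side of (2.4)."""
    return math.prod(x - 2 * math.cos(k * math.pi / (m + 1)) for k in range(1, m + 1))

def product_R11(m, x):
    """Right-hand side of (2.5)."""
    return math.prod(x - 2 * math.cos(k * math.pi / m) for k in range(1, m + 1))

def product_R10(m, x):
    """Product form of |R_m(x, 1, 0)|."""
    return math.prod(x - 2 * math.cos(2 * k * math.pi / (2 * m + 1)) for k in range(1, m + 1))
```

For any angle with $\sin\theta \neq 0$, `det_R(m, 2 * math.cos(theta), a, b)` should agree with `formula_2_3(m, theta, a, b)` to rounding error, and the three product forms should agree with `det_R` for the corresponding corner values.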


The determinant of the matrix $S_m(u, v, 0, 0)$ is easily evaluated by multiplying each row
in which u occurs by $\sqrt{v}$, each row in which v occurs by $\sqrt{u}$, and dividing each column
containing u by $\sqrt{u}$, and each column containing v by $\sqrt{v}$. This done, it is patent that

$$|S_{2n}(u, v, 0, 0)| = |P_{2n}(\sqrt{uv})| = \frac{\sin (2n+1)\theta}{\sin\theta},$$

$$|S_{2n+1}(u, v, 0, 0)| = \sqrt{\frac{u}{v}}\,|P_{2n+1}(\sqrt{uv})| = \sqrt{\frac{u}{v}}\cdot\frac{\sin (2n+2)\theta}{\sin\theta} = \frac{u\sin (2n+2)\theta}{\sin 2\theta},$$

where we now write $\sqrt{uv} = 2\cos\theta$. Using the addition theorem once more, we can obtain
formulae for $|S_m(u, v, a, b)|$. In particular

$$|S_{2n}(u, v, 1, 1)| = |S_{2n}(u, v, 0, 0)| + |S_{2n-1}(v, u, 0, 0)| + |S_{2n-1}(u, v, 0, 0)| + |S_{2n-2}(v, u, 0, 0)|$$

$$= \frac{\sin (2n+1)\theta}{\sin\theta} + \frac{(u+v)\sin 2n\theta}{\sin 2\theta} + \frac{\sin (2n-1)\theta}{\sin\theta} = \frac{(uv + u + v)\sin 2n\theta}{\sin 2\theta},$$

$$|S_{2n+1}(u, v, 1, 1)| = \frac{u\sin (2n+2)\theta + 2\sqrt{uv}\,\sin (2n+1)\theta + v\sin 2n\theta}{\sin 2\theta}$$

$$= \frac{2\{\sqrt{u}\sin (n+1)\theta + \sqrt{v}\sin n\theta\}\{\sqrt{u}\cos (n+1)\theta + \sqrt{v}\cos n\theta\}}{\sin 2\theta}.$$
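The two free-boundary formulae for alternating atoms can be checked numerically. This sketch is illustrative only: it assumes $0 < uv < 4$ so that $\theta$ is real, takes the diagonal of $S_m$ to start with u, and uses the hypothetical names `det_S`, `even_free` and `odd_free`.

```python
import math

def continuant(diag):
    """Determinant of a tridiagonal matrix with this diagonal and 1s beside it."""
    prev, cur = 1.0, diag[0]
    for d in diag[1:]:
        prev, cur = cur, d * cur - prev   # D_k = d_k * D_{k-1} - D_{k-2}
    return cur

def det_S(m, u, v, a, b):
    """|S_m(u, v, a, b)|: diagonal u, v, u, ... with b added first and a added last."""
    diag = [float(u) if k % 2 == 0 else float(v) for k in range(m)]
    diag[0] += b
    diag[-1] += a
    return continuant(diag)

def theta_of(u, v):
    """Angle defined by sqrt(uv) = 2 cos(theta); assumes 0 < uv < 4."""
    return math.acos(math.sqrt(u * v) / 2)

def even_free(n, u, v):
    """Closed form for |S_{2n}(u, v, 1, 1)|."""
    t = theta_of(u, v)
    return (u * v + u + v) * math.sin(2 * n * t) / math.sin(2 * t)

def odd_free(n, u, v):
    """Closed form for |S_{2n+1}(u, v, 1, 1)|."""
    t = theta_of(u, v)
    num = (u * math.sin((2 * n + 2) * t)
           + 2 * math.sqrt(u * v) * math.sin((2 * n + 1) * t)
           + v * math.sin(2 * n * t))
    return num / math.sin(2 * t)
```

With, say, u = 0.8 and v = 1.7, `det_S(2 * n, u, v, 1, 1)` and `even_free(n, u, v)` should agree to rounding error, and likewise for the odd case.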


3. The Latent Roots of Certain Matrices 


Let $L(x)$ be a square matrix of order $\lambda$ such that

$$L(x) = L(0) + xI,$$

where I denotes the unit matrix. Now $L(0)$ can be reduced to its classical canonical form
$\Lambda(0)$ by a non-singular matrix H, such that

$$HL(0)H^{-1} = \Lambda(0),$$

from which it follows that

$$HL(x)H^{-1} = \Lambda(0) + xI.$$

Thus H, which does not involve x, also reduces $L(x)$ to its canonical form

$$\Lambda(x) \equiv \Lambda(0) + xI.$$

Now, if the equation

$$|L(x)| = |L(0) + xI| = 0$$

has roots $l_1, \dots, l_\lambda$, then the latent roots of $L(x)$, and therefore of $\Lambda(x)$, are

$$x - l_i, \qquad [i = 1, \dots, \lambda],$$

for the characteristic equation of $L(x)$ is

$$|L(x) - kI| = |L(0) + (x-k)I| = |L(x-k)| = 0.$$

Let $M(x)$ be another square matrix of order $\mu$ such that

$$M(x) = M(0) + xI,$$

and let $L(M(x))$ be the partitioned matrix

$$L(M(x)) = L(0)\langle I\rangle + I\langle M(x)\rangle.$$

In the last formula we use the notation $A\langle B\rangle$ to denote the direct product matrix which is
more frequently denoted by $A \times B$. Thus, if

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \qquad B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix},$$

then

$$A\langle B\rangle = \begin{bmatrix} a_{11}B & a_{12}B \\ a_{21}B & a_{22}B \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} & a_{11}b_{12} & a_{12}b_{11} & a_{12}b_{12} \\ a_{11}b_{21} & a_{11}b_{22} & a_{12}b_{21} & a_{12}b_{22} \\ a_{21}b_{11} & a_{21}b_{12} & a_{22}b_{11} & a_{22}b_{12} \\ a_{21}b_{21} & a_{21}b_{22} & a_{22}b_{21} & a_{22}b_{22} \end{bmatrix}.$$

$L(M(x))$ is in fact the matrix of order $\lambda\mu$ which is obtained from $L(x)$ by replacing every element
x by the submatrix $M(x)$ and every element 1 by the unit matrix I. Using a theorem on the
product of direct product matrices (MacDuffee, p. 82), we find that

$$[H\langle I\rangle][L(M(x))][H^{-1}\langle I\rangle] = [H\langle I\rangle][L(0)\langle I\rangle + I\langle M(x)\rangle][H^{-1}\langle I\rangle]$$
$$= [HL(0)H^{-1}]\langle I\rangle + [HIH^{-1}]\langle M(x)\rangle$$
$$= \Lambda(0)\langle I\rangle + I\langle M(x)\rangle$$
$$= \Lambda(M(x)).$$

It follows that

$$|L(M(x))| = |\Lambda(M(x))| = \prod_{i}|M(x) - l_i I| = \prod_{i}|M(x - l_i)| = \prod_{i,j}(x - l_i - m_j), \qquad (3.1)$$

where $m_1, \dots, m_\mu$ are the roots of the equation $|M(x)| = 0$. Since the matrix $L(M(x))$ is
itself the sum of a matrix independent of x and a matrix which is x times the unit matrix of
order $\lambda\mu$, we have the following result:

Theorem: If the matrix $L(x)$ is of the form $L(0) + xI$ and the matrix $M(x)$ is of the form
$M(0) + xI$, then the latent roots of the matrix $L(M(0))$ are $l_i + m_j$ $[i = 1, \dots, \lambda;\ j = 1, \dots, \mu]$,
where $l_1, \dots, l_\lambda$ are the latent roots of $L(0)$ and $m_1, \dots, m_\mu$ are the latent roots of $M(0)$.
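The theorem can be illustrated on a small case. The sketch below is an illustration under stated assumptions rather than the paper's own argument: it takes $L(0) = P_\lambda(0)$ and $M(0) = P_\mu(0)$, whose latent roots $2\cos(k\pi/(m+1))$ and latent vectors $\sin(jk\pi/(m+1))$ are known, forms $L(M(0)) = L(0)\langle I\rangle + I\langle M(0)\rangle$, and checks that each direct product of latent vectors is a latent vector with root $l_i + m_j$. All function names are hypothetical.

```python
import math

def kron(A, B):
    """Direct product A<B>: each element a_ij of A is replaced by the block a_ij * B."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def path_matrix(m):
    """P_m(0): zero diagonal, 1s on the two neighbouring diagonals."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(m)] for i in range(m)]

def eigpair(m, k):
    """k-th latent root and latent vector of P_m(0)."""
    t = k * math.pi / (m + 1)
    return 2 * math.cos(t), [math.sin(j * t) for j in range(1, m + 1)]

def nested(L0, M0):
    """L(M(0)) = L(0)<I> + I<M(0)>."""
    A = kron(L0, identity(len(M0)))
    B = kron(identity(len(L0)), M0)
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(A, B)]

def max_residual(lam, mu):
    """Largest component of K w - (l_i + m_j) w over all pairs (i, j)."""
    K = nested(path_matrix(lam), path_matrix(mu))
    worst = 0.0
    for i in range(1, lam + 1):
        l, u = eigpair(lam, i)
        for j in range(1, mu + 1):
            m_root, v = eigpair(mu, j)
            w = [a * b for a in u for b in v]          # direct product of the vectors
            Kw = [sum(r * c for r, c in zip(row, w)) for row in K]
            worst = max(worst, max(abs(y - (l + m_root) * c) for y, c in zip(Kw, w)))
    return worst
```

A call such as `max_residual(3, 4)` should return a value at the level of rounding error, confirming that the twelve latent roots of the nested matrix are the sums $l_i + m_j$.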

4. Two-dimensional Types 

In the light of the results of § 3 it is now possible to deal with certain two-dimensional 
problems associated with both square and hexagonal lattices. Figs. 3 and 4 represent 
these cases when a fixed boundary is present and when there are two kinds of atoms arranged 
alternately. The bonds, that is to say the elastic forces, connecting the atoms to each other 
and to the boundary are represented by lines in the diagrams. 




The corresponding models of these lattices with free boundaries are obtained from the
above by removing the fixed boundaries and the bonds connecting them to the atoms.

Square Lattice. Since the matrices $P_m(x)$ and $R_m(x, 1, 1)$ are both of the type discussed
in § 3, we conclude from (2.4) and (3.1) that

$$|P_m(P_n(x))| = \prod_{k=1}^{m}\prod_{j=1}^{n}\left(x - 2\cos\frac{k\pi}{m+1} - 2\cos\frac{j\pi}{n+1}\right), \qquad (4.1)$$

and from (2.5) and (3.1) that

$$|R_m(R_n(x, 1, 1), 1, 1)| = \prod_{k=1}^{m}\prod_{j=1}^{n}\left(x - 2\cos\frac{k\pi}{m} - 2\cos\frac{j\pi}{n}\right). \qquad (4.2)$$

The formulae (4.1) and (4.2) apply to square lattices with fixed and free boundaries respectively
in which all the atoms are of the same kind. These results may be extended to the three-dimensional
case without difficulty.
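Formulae (4.1) and (4.2) can be confirmed for small lattices by building the partitioned matrix explicitly and evaluating its determinant by elimination. This is only a sketch: the helper names (`tridiag`, `det`, `det_nested`, `double_product`) are assumptions made for the illustration, and `free=True` selects the free-boundary case $R(\cdot, 1, 1)$ while `free=False` selects the fixed-boundary case $P(\cdot)$.

```python
import math

def kron(A, B):
    """Direct product A<B>."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def tridiag(m, x, corner):
    """R_m(x, corner, corner); corner = 0 gives P_m(x)."""
    M = [[1.0 if abs(i - j) == 1 else 0.0 for j in range(m)] for i in range(m)]
    for i in range(m):
        M[i][i] = float(x)
    M[0][0] += corner
    M[-1][-1] += corner
    return M

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n, d = len(A), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        if A[p][c] == 0.0:
            return 0.0
        if p != c:
            A[c], A[p] = A[p], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d

def det_nested(m, n, x, free=False):
    """|P_m(P_n(x))| or |R_m(R_n(x,1,1),1,1)| from the explicit mn-by-mn matrix."""
    c = 1.0 if free else 0.0
    A = kron(tridiag(m, 0.0, c), identity(n))
    B = kron(identity(m), tridiag(n, x, c))
    return det([[p + q for p, q in zip(ra, rb)] for ra, rb in zip(A, B)])

def double_product(m, n, x, free=False):
    """Right-hand side of (4.1) (fixed) or (4.2) (free)."""
    dm, dn = (m, n) if free else (m + 1, n + 1)
    return math.prod(x - 2 * math.cos(k * math.pi / dm) - 2 * math.cos(j * math.pi / dn)
                     for k in range(1, m + 1) for j in range(1, n + 1))
```

For example, `det_nested(3, 4, 0.9)` and `double_product(3, 4, 0.9)` should agree to rounding error, as should the corresponding calls with `free=True`.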

If the boundary is fixed and there are two kinds of atoms present, the appropriate determinant is

$$|S_m(S_n(u, v, 0, 0), S_n(v, u, 0, 0), 0, 0)|.$$

On multiplying and dividing the rows and columns of this determinant by $\sqrt{u}$ and $\sqrt{v}$ in the
same way as those of $|S_n(u, v, 0, 0)|$, we may show that the value of the former determinant is

$$|P_m(P_n(\sqrt{uv}))| = \prod_{k=1}^{m}\prod_{j=1}^{n}\left(\sqrt{uv} - 2\cos\frac{k\pi}{m+1} - 2\cos\frac{j\pi}{n+1}\right)$$

or

$$\sqrt{u/v}\;|P_m(P_n(\sqrt{uv}))| = \sqrt{u/v}\,\prod_{k=1}^{m}\prod_{j=1}^{n}\left(\sqrt{uv} - 2\cos\frac{k\pi}{m+1} - 2\cos\frac{j\pi}{n+1}\right)$$

according as the number of atoms is even or odd.
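Hexagonal Lattice. Let us write

$$D_{2n}(p, q, r, s) = \begin{bmatrix} pI_n & qI_n + rJ' \\ qI_n + rJ & sI_n \end{bmatrix}, \qquad D_{2n+1}(p, q, r, s) = \begin{bmatrix} pI_{n+1} & qH' + rK' \\ qH + rK & sI_n \end{bmatrix},$$

where

$$J = \begin{bmatrix} 0 & 1 & & \\ & 0 & \ddots & \\ & & \ddots & 1 \\ & & & 0 \end{bmatrix}_{(n \times n)}, \qquad H = \begin{bmatrix} I_n & 0 \end{bmatrix}_{(n \times (n+1))}, \qquad K = \begin{bmatrix} 0 & I_n \end{bmatrix}_{(n \times (n+1))},$$

where the dash (') denotes a transposed matrix and where p, r, s are non-zero scalars.

Taking determinants of both sides of the following matrix equation

$$\begin{bmatrix} I & O \\ -(qI + rJ) & pI \end{bmatrix}\begin{bmatrix} pI & qI + rJ' \\ qI + rJ & sI \end{bmatrix} = \begin{bmatrix} pI & qI + rJ' \\ O & (ps - q^2)I - qr(J + J') - r^2JJ' \end{bmatrix},$$

we find that

$$|pI|\,|D_{2n}(p, q, r, s)| = |pI|\,|(ps - q^2)I - qr(J + J') - r^2JJ'|.$$

If, after cancelling out the factor $|pI|$ on each side, we multiply each odd row and each odd
column of the remaining determinant on the right-hand side by $-1$, this determinant becomes

$$|(ps - q^2)I + qr(J + J') - r^2JJ'|.$$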

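Further, the matrix JJ' is the same as the unit matrix I except that the bottom-right element is
zero. It follows that

$$|D_{2n}(p, q, r, s)| = (qr)^n\left|R_n\!\left(\frac{ps - q^2 - r^2}{qr},\ \frac{r}{q},\ 0\right)\right|.$$

A similar argument shows that

$$|D_{2n+1}(p, q, r, s)| = p\,(qr)^n\left|P_n\!\left(\frac{ps - q^2 - r^2}{qr}\right)\right|.$$

Thus, the determinant $|D_m|$ can be evaluated with the help of (2.2) and (2.3).

The matrix associated with the hexagonal lattice illustrated in fig. 4 takes the form
$S_{2n+1}(U, V, O, O)$, where we write



By rearranging rows and columns it may be shown that

$$|S_{2n+1}(U, V, O, O)| = \begin{vmatrix} I_{n+1}\langle U\rangle & (H' + K')\langle I_{2m}\rangle \\ (H + K)\langle I_{2m}\rangle & I_n\langle V\rangle \end{vmatrix} = |D_{2n+1}(U, I, I, V)|.$$

This determinant may be treated similarly to $|D_{2n+1}(p, q, r, s)|$. Taking determinants of
both sides of the matrix equation,

$$\begin{bmatrix} I_{n+1}\langle I_{2m}\rangle & O \\ -(H + K)\langle I_{2m}\rangle & I_n\langle U\rangle \end{bmatrix}\begin{bmatrix} I_{n+1}\langle U\rangle & (H' + K')\langle I_{2m}\rangle \\ (H + K)\langle I_{2m}\rangle & I_n\langle V\rangle \end{bmatrix} = \begin{bmatrix} I_{n+1}\langle U\rangle & (H' + K')\langle I_{2m}\rangle \\ O & I_n\langle UV\rangle - \{(H + K)(H' + K')\}\langle I_{2m}\rangle \end{bmatrix},$$

we find that

$$|U|^n\,|D_{2n+1}(U, I, I, V)| = |U|^{n+1}\,\bigl|I_n\langle UV\rangle - \{(H + K)(H' + K')\}\langle I\rangle\bigr|.$$

Now it is readily verified that HH' = KK' = I and that KH' = J, HK' = J'. Thus

$$|S_{2n+1}(U, V, O, O)| = |U|\,|P_n(UV - 2I)| = |U|\prod_{k=1}^{n}\left|UV - 2I - 2\cos\frac{k\pi}{n+1}\,I\right| = |U|\prod_{k=1}^{n}\left|UV - 4\cos^2\frac{k\pi}{2n+2}\,I\right|.$$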

Again, an argument similar to the preceding one shows that 


(2 n +1) 


I D 2 (U, zl> O, V) | = | UV ~z 2 I |. 

On the other hand, if we rearrange the rows and columns of D a (U, zl, O, V) in the order 
2*# + 1, 2, 2^ + 3, 4, . . 2 1, 2^ + 2, 3, 2m +4, . . 4 m, it will be seen that 




Hence, if we write


then 


2 * =2C0S ^)> 


hn uv-zi-1 


z k 


— 2 COS <f> k , 


I s 2n+1 (u, V, o, 0) H UI n I D 2 (U, **i, 0, v) 

Te -1 


= |U|n,f |R 2m (2 cos^, o) 


Jfc=1 


= - i) m n 


« sin (2m 4 - i)<l>k+ 2 i m sin 2m<£ fc 


sin <£ fc 


Two particular cases are of special interest. If n = 1, the determinant is

sin (2 m 4 * i)(f> 4 - 2~% sin 2m(f> 


I S 3 (U, V, 0 , O) | = 2 m («z»- v ^ - sin ^ 


/ - i) m | ! 


* 


where 2 cos - 3) j\/2. This determinant is the appropriate one for a hexagonal chain 

of 6 m atoms. 

On the other hand, if we take m = 1, we have the determinant

* 2* sin zSjc + zl sin 2<£ A 
IS 2 „ +1 (U,V,o ; 0)1=(«0-1)n - --f k i—- 

z k sm ?Jc 
n 

= (W ~ l) II (4** COS 2 <j> k + 2 S* COS <j> k - 

Jfc=l 


Hence, if we write 


71 

=-1) n {(^ - *f) 2 - 

*=1 


4 = V( I + 


4 . 1 )-^ 


9 4- 8 cos 


hir 
n + i)/ 


we obtain in the case m = 1 
I S 2ffl+1 (U, V, O, O) I ^ 

= (UV - i) n {V UV 4* i(i + 4)}{Vw - i(i + 4)}{V ua + |(i - 4)}{V- i(i - 4)}- 


This determinant arises from a hexagonal chain of 4n + 2 atoms.


5. Conclusion 


In this paper we have evaluated a number of determinants of a type which are likely to 
occur in theoretical physics. Some of the simpler cases have in fact been already discussed 
by other writers. Thus Goodwin (1939) deals with an equation of the type $|R_m(x, a, a)| = 0$
and discusses its roots at some length. Lennard-Jones (1937) has evaluated a determinant
which, on rearranging the rows and columns, becomes | D 2r (€, ft, ft, e) |. Lennard-Jones 
and Turkevich (1937) considered a determinant which reduces to 

| D*<€, ft, ft, €) | -ft I D 2v _ 2 (€, ft, ft, c) |-2(ftft) 2 . 


These and other cases have also been treated by Coulson (1938) while Born (1942) dealt with 
| S w (&, L *0 I* _ , . 

No doubt the methods described here can be extended to cover other cases but in some 
cases, such as that of the hexagonal lattice with a free boundary, there appears at present 
to be an insuperable difficulty. Nevertheless, as Ledermann (1944) has shown, the conditions
at the boundary can be modified without effecting an essential alteration in the density
of the roots of the secular equation. Thus, if the density of the roots rather than their 
actual value is required, the hexagonal lattice with a free boundary can be replaced by one 




with a fixed boundary. Again, although the boundary of the hexagonal lattice in fig. 4 is a
rectangular one, we may imagine two opposite edges joined together to form a sort of
cylindrical network. This cylinder can then be cut along a different line and opened out 
again to form a hexagonal lattice with a boundary in the shape of a parallelogram. In this 
case also there will be no essential modification of the density of the roots of the secular equation, 
provided the total number of atoms in the lattice is large compared with the number of atoms 
whose immediate neighbourhood has been modified. 


REFERENCES TO LITERATURE

Born, M., 1942. "Lattice dynamics and X-ray scattering", Proc. Phys. Soc., LIV, 362.

Coulson, C. A., 1938. "The electronic structure of some polyenes and aromatic molecules IV",
Proc. Roy. Soc., A, CLXIV, 383.

Goodwin, E. T., 1939. "Electronic states at the surfaces of crystals", Proc. Camb. Phil. Soc., XXXV,
221.

Ledermann, W., 1944. "Asymptotic formulae relating to the physical theory of crystals", Proc.
Roy. Soc., A, CLXXXII, 362.

Lennard-Jones, J. E., 1937. "The electronic structure of some polyenes and aromatic molecules I",
Proc. Roy. Soc., A, CLVIII, 208.

Lennard-Jones, J. E., and Turkevich, J., 1937. "The electronic structure of some polyenes and
aromatic molecules II", Proc. Roy. Soc., A, CLVIII, 297.

MacDuffee, C. C., 1933. The Theory of Matrices, Berlin.

Muir, T., 1923. History of the Theory of Determinants IV, London.


(Issued separately May 8, 1947)





XXVI. The Universal Integral Invariants of Hamiltonian Systems and Application
to the Theory of Canonical Transformations. By Hwa-Chung Lee,
Ph.D. (Edin.), Wuhan University, China. Communicated by Sir Edmund
Whittaker, F.R.S.

(MS. received October i, 1945. Read December 3, 1945) 


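1. Introduction. Consider a Hamiltonian system of differential equations

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i} \qquad (i = 1, \dots, n), \qquad (1.1)$$

where H is a function of the 2n variables $q_i$ and $p_i$ involving in general also the time t. For
each given Hamiltonian function H the system (1.1) possesses infinitely many absolute and
relative integral invariants of every order $r = 1, \dots, 2n$, which can all be written out when
(1.1) is integrated.* Our interest now is not in these integral invariants, which are possessed
by one Hamiltonian system, but in those which are possessed by all Hamiltonian systems.
Such an integral invariant, which is independent of the Hamiltonian H, is said to be universal.
A universal relative integral invariant of order 1 is $\oint\omega$, where $\omega$ is the Pfaffian form †

$$\omega = \sum_{i=1}^{n} p_i\,\delta q_i, \qquad (1.2)$$

and consequently (by Stokes's theorem) a universal absolute integral invariant of order 2 is
$\int\omega'$, where $\omega'$ is the exterior derivative of $\omega$, namely

$$\omega' = \sum_{i} \delta p_i\,\delta q_i, \qquad (1.3)$$

and then the integrals $\int\omega'^2, \int\omega'^3, \dots, \int\omega'^n$ are universal absolute integral invariants of orders
4, 6, ..., 2n respectively,‡ where

$$\omega'^2 = \sum_{i,j} \delta p_i\,\delta q_i\,\delta p_j\,\delta q_j = 2!\sum_{i<j} \delta p_i\,\delta q_i\,\delta p_j\,\delta q_j,$$
$$\omega'^3 = \sum_{i,j,k} \delta p_i\,\delta q_i\,\delta p_j\,\delta q_j\,\delta p_k\,\delta q_k = 3!\sum_{i<j<k} \delta p_i\,\delta q_i\,\delta p_j\,\delta q_j\,\delta p_k\,\delta q_k, \qquad (1.4)$$
$$\omega'^n = \sum_{i_1, \dots, i_n} \delta p_{i_1}\delta q_{i_1} \dots \delta p_{i_n}\delta q_{i_n} = n!\,\delta p_1\,\delta q_1 \dots \delta p_n\,\delta q_n,$$

and the integrals $\oint\omega\omega', \oint\omega\omega'^2, \dots, \oint\omega\omega'^{n-1}$ are universal relative integral invariants of
orders 3, 5, ..., 2n - 1 respectively,§ where

$$\omega\omega' = \sum_{i,j} p_i\,\delta q_i\,\delta p_j\,\delta q_j,$$
$$\omega\omega'^2 = \sum_{i,j,k} p_i\,\delta q_i\,\delta p_j\,\delta q_j\,\delta p_k\,\delta q_k, \qquad (1.5)$$
$$\omega\omega'^{n-1} = \sum_{i_1, \dots, i_n} p_{i_1}\delta q_{i_1}\,\delta p_{i_2}\delta q_{i_2} \dots \delta p_{i_n}\delta q_{i_n}.$$

* For r = 1, see E. T. Whittaker, Analytical Dynamics, 2nd ed., § 117. For any r, see E. Goursat, Leçons sur
le problème de Pfaff, p. 214.

† Whittaker, l.c., p. 272.

‡ Goursat, l.c., p. 231.

§ This follows from a known theorem (Goursat, l.c., p. 212) by noting that $(\omega\omega')' = \omega'^2$, $(\omega\omega'^2)' = \omega'^3$, ...,
$(\omega\omega'^{n-1})' = \omega'^n$.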





We shall show that these integrals of Poincaré are essentially the only universal integral
invariants,* i.e. there is no universal absolute integral invariant of odd order, and the only
universal absolute integral invariants of even orders 2, 4, 6, ..., 2n are, to within constant
multiples, respectively the following:

$$\int\omega', \quad \int\omega'^2, \quad \int\omega'^3, \quad \dots, \quad \int\omega'^n, \qquad (1.6)$$

and there is no universal relative integral invariant of even order, and the only universal relative
integral invariants of odd orders 1, 3, 5, ..., 2n - 1 are respectively constant multiples of

$$\oint\omega, \quad \oint\omega\omega', \quad \oint\omega\omega'^2, \quad \dots, \quad \oint\omega\omega'^{n-1}. \qquad (1.7)$$

Our method is tensorial, as this is much more powerful than the method generally used
in questions connected with Hamilton's canonical equations.

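2. Methodical Preparations. For unification of notation the 2n variables $q_1, \dots, q_n$,
$p_1, \dots, p_n$ will be written as $x^1, \dots, x^n, x^{n+1}, \dots, x^{2n}$ respectively, thus

$$x^i = q_i, \qquad x^{n+i} = p_i \qquad (i = 1, \dots, n). \qquad (2.1)$$

Reserving the Latin indices i, j, ... for the range 1, ..., n, we shall use Greek indices for the
double range 1, ..., 2n. The summation convention for Greek indices will be understood.
Thus (1.1) may be condensed into

$$\frac{dx^\alpha}{dt} = \epsilon^{\alpha\beta}H_\beta, \qquad (2.2)$$

where $H_\beta = \partial H/\partial x^\beta$ and

$$\epsilon^{ij} = 0, \qquad \epsilon^{i,n+j} = \delta^{ij}, \qquad \epsilon^{n+i,j} = -\delta^{ij}, \qquad \epsilon^{n+i,n+j} = 0, \qquad (2.3)$$

a delta (with two indices) having value 1 or 0 according as the two indices are equal or unequal.
In view of (2.3) we may represent $\epsilon^{\alpha\beta}$ by the skew-symmetrical matrix

$$\epsilon^{\alpha\beta} = \begin{pmatrix} O & I \\ -I & O \end{pmatrix}, \qquad (2.4)$$

where O and I denote the zero and unit matrices of order n respectively. This skew-symmetrical
matrix, of order 2n, is non-singular, and so has an inverse which is again skew-symmetrical
and which we write

$$\epsilon_{\alpha\beta} = \begin{pmatrix} O & -I \\ I & O \end{pmatrix}, \qquad (2.5)$$

whence

$$\epsilon_{ij} = 0, \qquad \epsilon_{i,n+j} = -\delta_{ij}, \qquad \epsilon_{n+i,j} = \delta_{ij}, \qquad \epsilon_{n+i,n+j} = 0. \qquad (2.6)$$

As $\epsilon^{\alpha\beta}$ and $\epsilon_{\alpha\beta}$ are inverse to each other, we have the relations

$$\epsilon^{\alpha\sigma}\epsilon_{\sigma\beta} = \epsilon_{\beta\sigma}\epsilon^{\sigma\alpha} = \delta^\alpha_\beta. \qquad (2.7)$$

The skew-symmetry of the two epsilons is expressed by the equations

$$\epsilon^{\alpha\beta} = -\epsilon^{\beta\alpha}, \qquad \epsilon_{\alpha\beta} = -\epsilon_{\beta\alpha}. \qquad (2.8)$$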
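The matrix statements (2.4) to (2.8) can be verified directly for any small n. The sketch below is illustrative only; the names `eps_upper` and `eps_lower` (for $\epsilon^{\alpha\beta}$ and $\epsilon_{\alpha\beta}$) and the zero-based indexing are choices made for this example.

```python
def eps_upper(n):
    """Matrix of epsilon^{alpha beta}, i.e. [[O, I], [-I, O]] as in (2.4)."""
    size = 2 * n
    E = [[0] * size for _ in range(size)]
    for i in range(n):
        E[i][n + i] = 1
        E[n + i][i] = -1
    return E

def eps_lower(n):
    """Matrix of epsilon_{alpha beta}, i.e. [[O, -I], [I, O]] as in (2.5)."""
    size = 2 * n
    E = [[0] * size for _ in range(size)]
    for i in range(n):
        E[i][n + i] = -1
        E[n + i][i] = 1
    return E

def matmul(A, B):
    """Ordinary matrix product of two square integer matrices."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def is_identity(M):
    return all(M[i][j] == (1 if i == j else 0)
               for i in range(len(M)) for j in range(len(M)))

def is_skew(M):
    return all(M[i][j] == -M[j][i] for i in range(len(M)) for j in range(len(M)))
```

With these definitions, `is_identity(matmul(eps_upper(n), eps_lower(n)))` expresses the inverse relation (2.7), and `is_skew` applied to either matrix expresses (2.8).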

We shall have to make use of the variational equations † of the Hamiltonian system (2.2),
which are obtained by operating on (2.2) by the differential symbol $\delta$. The two differential
symbols d and $\delta$ being commutative, the variational system of (2.2) is then

$$\frac{d(\delta x^\alpha)}{dt} = \epsilon^{\alpha\beta}H_{\beta\gamma}\,\delta x^\gamma, \qquad H_{\beta\gamma} = \frac{\partial^2 H}{\partial x^\beta\,\partial x^\gamma}. \qquad (2.9)$$

3. Absolute Integral Invariants. Let

$$\Omega_r = A_{\alpha_1 \dots \alpha_r}\,\delta x^{\alpha_1} \dots \delta x^{\alpha_r} \qquad (3.1)$$

be an exterior differential form of degree r ($1 \le r \le 2n$), where $A_{\alpha_1 \dots \alpha_r}$ is skew-symmetrical in
the indices $\alpha_1, \dots, \alpha_r$, and is a collection of functions of the variables x and of the time t.

* That $\oint\omega$ is the only relative integral invariant of the first order for every Hamiltonian system was believed
to be true and posed to me by Professor J. S. Wang, to whom I am indebted.

† Whittaker, l.c., p. 269.


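By definition, the r-ple integral $\int\Omega_r$ is an absolute integral invariant (of order r) of the
Hamiltonian system (2.2) if, in consequence of this system (2.2) and of its variational system
(2.9), we have $\frac{d}{dt}\int\Omega_r = 0$ for an arbitrary (r-dimensional) domain of integration, whence

$$\frac{d\Omega_r}{dt} = 0,$$

or

$$\frac{d}{dt}\left(A_{\alpha_1 \dots \alpha_r}\,\delta x^{\alpha_1} \dots \delta x^{\alpha_r}\right) = 0.$$

In consequence of (2.9), this may be written:

$$\left[\frac{dA_{\alpha_1 \dots \alpha_r}}{dt} + \epsilon^{\lambda\mu}\left(A_{\lambda\alpha_2 \dots \alpha_r}H_{\mu\alpha_1} + \dots + A_{\alpha_1 \dots \alpha_{r-1}\lambda}H_{\mu\alpha_r}\right)\right]\delta x^{\alpha_1} \dots \delta x^{\alpha_r} = 0.$$

Now it is easily seen that the expression inside the brackets ( ) is skew-symmetrical in the
indices $\alpha_1, \dots, \alpha_r$, so that the whole expression inside the brackets [ ] is skew-symmetrical
in these indices. Since the (r-dimensional) volume-element $\delta x^{\alpha_1} \dots \delta x^{\alpha_r}$ is an arbitrary
skew-symmetrical quantity, its coefficient in the above identity must vanish:

$$\frac{dA_{\alpha_1 \dots \alpha_r}}{dt} + \epsilon^{\lambda\mu}\left(A_{\lambda\alpha_2 \dots \alpha_r}\delta^\nu_{\alpha_1} + \dots + A_{\alpha_1 \dots \alpha_{r-1}\lambda}\delta^\nu_{\alpha_r}\right)H_{\mu\nu} = 0,$$

which, when the first term on the left is differentiated in consequence of (2.2), becomes

$$\frac{\partial A_{\alpha_1 \dots \alpha_r}}{\partial t} + \frac{\partial A_{\alpha_1 \dots \alpha_r}}{\partial x^\lambda}\,\epsilon^{\lambda\mu}H_\mu + \epsilon^{\lambda\mu}\left(A_{\lambda\alpha_2 \dots \alpha_r}\delta^\nu_{\alpha_1} + \dots + A_{\alpha_1 \dots \alpha_{r-1}\lambda}\delta^\nu_{\alpha_r}\right)H_{\mu\nu} = 0. \qquad (3.2)$$

This is the condition that $\int\Omega_r$ be an absolute integral invariant of the Hamiltonian system
(2.2).

In order that this integral invariant $\int\Omega_r$ be universal, (3.2) must hold for an arbitrary
function H, whence

$$\frac{\partial A_{\alpha_1 \dots \alpha_r}}{\partial t} = 0, \qquad \frac{\partial A_{\alpha_1 \dots \alpha_r}}{\partial x^\lambda}\,\epsilon^{\lambda\mu} = 0, \qquad (3.3)$$

$$\epsilon^{\lambda\mu}\left(\delta^\nu_{\alpha_1}A_{\lambda\alpha_2 \dots \alpha_r} + \dots + \delta^\nu_{\alpha_r}A_{\alpha_1 \dots \alpha_{r-1}\lambda}\right) + \epsilon^{\lambda\nu}\left(\delta^\mu_{\alpha_1}A_{\lambda\alpha_2 \dots \alpha_r} + \dots + \delta^\mu_{\alpha_r}A_{\alpha_1 \dots \alpha_{r-1}\lambda}\right) = 0. \qquad (3.4)$$

Of the two conditions in (3.3), the first implies that $A_{\alpha_1 \dots \alpha_r}$ is independent of t, and the second
(where $\mu$ is a free index) is equivalent to $\partial A_{\alpha_1 \dots \alpha_r}/\partial x^\lambda = 0$, so that $A_{\alpha_1 \dots \alpha_r}$ is also independent of
the x's. Thus the A's are all constants.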

For the solution of (3.4) we consider first a few simple cases.

4. The Case r = 1. In this case, (3.1) is a Pfaffian form $\Omega_1 = A_\alpha\,\delta x^\alpha$, and (3.4) becomes

$$\epsilon^{\lambda\mu}\delta^\nu_\alpha A_\lambda + \epsilon^{\lambda\nu}\delta^\mu_\alpha A_\lambda = 0.$$

Contracting this for $\nu$ and $\alpha$ we find $(2n+1)\epsilon^{\lambda\mu}A_\lambda = 0$, whence (transvecting by $\epsilon_{\mu\alpha}$)

$$(2n+1)A_\alpha = 0,$$

i.e. $A_\alpha = 0$, which is the only solution of (3.4) for r = 1. We have then

Theorem 1. There is no universal absolute integral invariant of the first order.

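5. The Case r = 2. In this case, (3.1) is of the form $\Omega_2 = A_{\alpha\beta}\,\delta x^\alpha\,\delta x^\beta$, and (3.4) becomes

$$\epsilon^{\lambda\mu}\left(\delta^\nu_\alpha A_{\lambda\beta} + \delta^\nu_\beta A_{\alpha\lambda}\right) + \epsilon^{\lambda\nu}\left(\delta^\mu_\alpha A_{\lambda\beta} + \delta^\mu_\beta A_{\alpha\lambda}\right) = 0. \qquad (5.1)$$

Contracting this for $\nu$ and $\beta$ we find

$$2n\,\epsilon^{\lambda\mu}A_{\alpha\lambda} + \delta^\mu_\alpha A = 0 \qquad (A = A_{\alpha\beta}\,\epsilon^{\alpha\beta}),$$

whence (transvecting by $\epsilon_{\mu\beta}$)

$$2n\,A_{\alpha\beta} + \epsilon_{\alpha\beta}A = 0,$$

or

$$A_{\alpha\beta} = -\frac{A}{2n}\,\epsilon_{\alpha\beta}, \qquad (5.2)$$

which is the most general solution of (5.1), i.e. of (3.4) for r = 2, where A is an arbitrary constant.
Now by (2.6) and (2.1) we find

$$\epsilon_{\alpha\beta}\,\delta x^\alpha\,\delta x^\beta = \sum_i\left(\delta p_i\,\delta q_i - \delta q_i\,\delta p_i\right) = 2\sum_i \delta p_i\,\delta q_i, \qquad (5.3)$$

so that (5.2) implies

$$\Omega_2 = A_{\alpha\beta}\,\delta x^\alpha\,\delta x^\beta = -\frac{A}{2n}\,\epsilon_{\alpha\beta}\,\delta x^\alpha\,\delta x^\beta = -\frac{A}{n}\sum_i \delta p_i\,\delta q_i,$$

and we have

Theorem 2. Apart from an arbitrary constant factor, there is only one universal absolute
integral invariant of the second order, namely $\int\sum_i \delta p_i\,\delta q_i$.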
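Condition (5.1) is a finite system of linear equations in the constants $A_{\alpha\beta}$, so Theorem 2 can be illustrated by brute force for small n. In the sketch below (an illustration under the index conventions of § 2, with zero-based indices and hypothetical function names) `residual_51` returns the largest absolute value taken by the left-hand side of (5.1) over all choices of $\alpha, \beta, \mu, \nu$; it vanishes for $A_{\alpha\beta} = \epsilon_{\alpha\beta}$ and not for a skew-symmetric array that is not a multiple of $\epsilon_{\alpha\beta}$.

```python
def eps_upper(n):
    """epsilon^{alpha beta} as the matrix [[O, I], [-I, O]]."""
    size = 2 * n
    E = [[0] * size for _ in range(size)]
    for i in range(n):
        E[i][n + i] = 1
        E[n + i][i] = -1
    return E

def eps_lower(n):
    """epsilon_{alpha beta} as the matrix [[O, -I], [I, O]]."""
    size = 2 * n
    E = [[0] * size for _ in range(size)]
    for i in range(n):
        E[i][n + i] = -1
        E[n + i][i] = 1
    return E

def residual_51(A, n):
    """Largest |left-hand side of (5.1)| over all free indices alpha, beta, mu, nu."""
    size = 2 * n
    Eu = eps_upper(n)
    delta = lambda a, b: 1 if a == b else 0
    worst = 0
    for al in range(size):
        for be in range(size):
            for mu in range(size):
                for nu in range(size):
                    total = 0
                    for lam in range(size):   # summation over the repeated index lambda
                        total += Eu[lam][mu] * (delta(nu, al) * A[lam][be]
                                                + delta(nu, be) * A[al][lam])
                        total += Eu[lam][nu] * (delta(mu, al) * A[lam][be]
                                                + delta(mu, be) * A[al][lam])
                    worst = max(worst, abs(total))
    return worst

def other_skew(n):
    """A skew-symmetric array coupling q_1 with q_2 only; not a multiple of eps_lower."""
    size = 2 * n
    A = [[0] * size for _ in range(size)]
    A[0][1], A[1][0] = 1, -1
    return A
```

For n = 2, `residual_51(eps_lower(2), 2)` is exactly zero in integer arithmetic, whereas `residual_51(other_skew(2), 2)` is positive, in line with the uniqueness asserted by Theorem 2.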

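6. The Case r = 3. In this case, (3.1) has the form $\Omega_3 = A_{\alpha\beta\gamma}\,\delta x^\alpha\,\delta x^\beta\,\delta x^\gamma$, and (3.4) becomes

$$\epsilon^{\lambda\mu}\left(\delta^\nu_\alpha A_{\lambda\beta\gamma} + \delta^\nu_\beta A_{\alpha\lambda\gamma} + \delta^\nu_\gamma A_{\alpha\beta\lambda}\right) + \epsilon^{\lambda\nu}\left(\delta^\mu_\alpha A_{\lambda\beta\gamma} + \delta^\mu_\beta A_{\alpha\lambda\gamma} + \delta^\mu_\gamma A_{\alpha\beta\lambda}\right) = 0.$$

Contracting this for $\nu$ and $\gamma$ we find

$$(2n-1)\,\epsilon^{\lambda\mu}A_{\alpha\beta\lambda} - \delta^\mu_\alpha A_\beta + \delta^\mu_\beta A_\alpha = 0 \qquad (A_\alpha = A_{\alpha\beta\gamma}\,\epsilon^{\beta\gamma}),$$

whence (transvecting by $\epsilon_{\mu\gamma}$)

$$(2n-1)A_{\alpha\beta\gamma} - \epsilon_{\alpha\gamma}A_\beta + \epsilon_{\beta\gamma}A_\alpha = 0.$$

Transvecting this by $\epsilon^{\alpha\beta}$ we find

$$(2n+1)A_\gamma = 0,$$

i.e. $A_\gamma = 0$, and therefore the above equation implies $A_{\alpha\beta\gamma} = 0$, which is the only solution of
(3.4) for r = 3. Hence

Theorem 3. There is no universal absolute integral invariant of the third order.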

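7. The Case r = 4. In this case, (3.1) has the form $\Omega_4 = A_{\alpha\beta\gamma\sigma}\,\delta x^\alpha\,\delta x^\beta\,\delta x^\gamma\,\delta x^\sigma$, and (3.4)
becomes

$$\epsilon^{\lambda\mu}\left(\delta^\nu_\alpha A_{\lambda\beta\gamma\sigma} + \delta^\nu_\beta A_{\alpha\lambda\gamma\sigma} + \delta^\nu_\gamma A_{\alpha\beta\lambda\sigma} + \delta^\nu_\sigma A_{\alpha\beta\gamma\lambda}\right) + \epsilon^{\lambda\nu}\left(\delta^\mu_\alpha A_{\lambda\beta\gamma\sigma} + \delta^\mu_\beta A_{\alpha\lambda\gamma\sigma} + \delta^\mu_\gamma A_{\alpha\beta\lambda\sigma} + \delta^\mu_\sigma A_{\alpha\beta\gamma\lambda}\right) = 0. \qquad (7.1)$$

Contracting this for $\nu$ and $\sigma$ we find

$$(2n-2)\,\epsilon^{\lambda\mu}A_{\alpha\beta\gamma\lambda} + \delta^\mu_\alpha A_{\beta\gamma} - \delta^\mu_\beta A_{\alpha\gamma} + \delta^\mu_\gamma A_{\alpha\beta} = 0 \qquad (A_{\alpha\beta} = A_{\alpha\beta\gamma\sigma}\,\epsilon^{\gamma\sigma}),$$

whence (transvecting by $\epsilon_{\mu\sigma}$)

$$(2n-2)A_{\alpha\beta\gamma\sigma} + \epsilon_{\alpha\sigma}A_{\beta\gamma} + \epsilon_{\beta\sigma}A_{\gamma\alpha} + \epsilon_{\gamma\sigma}A_{\alpha\beta} = 0.$$

Transvecting this by $\epsilon^{\alpha\beta}$ we find

$$2n\,A_{\gamma\sigma} + \epsilon_{\gamma\sigma}A = 0 \qquad (A = A_{\alpha\beta}\,\epsilon^{\alpha\beta} = A_{\alpha\beta\gamma\sigma}\,\epsilon^{\alpha\beta}\epsilon^{\gamma\sigma}),$$

whence

$$A_{\gamma\sigma} = -\frac{A}{2n}\,\epsilon_{\gamma\sigma},$$

and therefore

$$A_{\alpha\beta\gamma\sigma} = \frac{A}{2n(2n-2)}\left(\epsilon_{\alpha\beta}\epsilon_{\gamma\sigma} + \epsilon_{\beta\gamma}\epsilon_{\alpha\sigma} + \epsilon_{\gamma\alpha}\epsilon_{\beta\sigma}\right), \qquad (7.2)$$

which is the most general solution of (7.1), i.e. of (3.4) for r = 4, A being an arbitrary constant.

Now, by (7.2),

$$\Omega_4 = A_{\alpha\beta\gamma\sigma}\,\delta x^\alpha\,\delta x^\beta\,\delta x^\gamma\,\delta x^\sigma = \frac{3A}{2n(2n-2)}\left(\epsilon_{\alpha\beta}\,\delta x^\alpha\,\delta x^\beta\right)\left(\epsilon_{\gamma\sigma}\,\delta x^\gamma\,\delta x^\sigma\right) = \frac{3A}{n(n-1)}\left(\sum_i \delta p_i\,\delta q_i\right)\left(\sum_j \delta p_j\,\delta q_j\right) \quad \text{by (5.3)},$$

and we have

Theorem 4. Apart from an arbitrary constant factor, there is only one universal absolute
integral invariant of the fourth order, namely $\int\sum_{i,j} \delta p_i\,\delta q_i\,\delta p_j\,\delta q_j$.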

8. The General Case. —Continuing the solution of (3.4) for any r in the same way it is clear 
that we have 

Theorem 5. There is no universal absolute integral invariant of any odd order , and apart 
from an arbitrary constant factor there is only one universal absolute integral invariant of every 
even order 2s, namely . . . 8 p is 8 q is . 

In fact, contracting (3.4) for v and a r we find 

(2»-r + 2)e A '*A ai ... ar _ 1 * 

+ ( - ..«_! + ( - l) T -%K ia ,...ar-i + • • • 

(where A 0l .. , ar _ 2 = A 0l .. . ar e°’-i a '), 

whence (transvecting by 


(s#-r + 2)A ai ... ar 

+ ( — l) r 2 € , a 1 Or^a 2 a 3l 


+ (- i ) r - 3 e aa < Jr A aiC , 3 ... ar _ 1 + 


^ar— iOtAojOj. 


.ar-2 


= 0. 


Transvecting this by € ai<Z2 we have 


( 272 ~^ + 4)A a3 ... ar . , 

+ ( ” l) . .ar-i ^^ar^ajaa.. .a^i ^ar-iOr^aa *. .at —2 ~ ° 

(where A ai .. .a r - 4 = A fll .. . ar _ 2 e a ^ a r~2 = A ai .. ^ € a r-2 a r-2e a r~l a r). 

Continuing to transvect this by $\epsilon^{\alpha_3\alpha_4}$, then the result again by $\epsilon^{\alpha_5\alpha_6}$, and so on, we see that if
r is odd, the result after transvection by $\epsilon^{\alpha_{r-2}\alpha_{r-1}}$ is

$$(2n+1)A_{\alpha_r} = 0,$$

whence (by tracing the equations back successively)

$$A_{\alpha_r} = 0,\quad A_{\alpha_{r-2}\alpha_{r-1}\alpha_r} = 0,\quad A_{\alpha_{r-4}\ldots\alpha_r} = 0,\ \ldots$$

and finally $A_{\alpha_1\ldots\alpha_r} = 0$, which is the only solution of (3.4) in this odd case. The first half of
Theorem 5 is proved.

On the other hand, if r is even, the result after transvection by $\epsilon^{\alpha_{r-3}\alpha_{r-2}}$ is

$$2nA_{\alpha_{r-1}\alpha_r} + \epsilon_{\alpha_{r-1}\alpha_r}A = 0 \qquad (A = A_{\alpha_1\ldots\alpha_r}\epsilon^{\alpha_1\alpha_2}\cdots\epsilon^{\alpha_{r-1}\alpha_r}),$$

or

$$A_{\alpha_{r-1}\alpha_r} = -\frac{A}{2n}\,\epsilon_{\alpha_{r-1}\alpha_r},$$

whence (tracing back successively) we obtain the solution of (3.4) in this even case, which
involves the arbitrary constant A and is of the form


$$A_{\alpha_1\ldots\alpha_r} = \frac{(-1)^{r/2}A}{2n(2n-2)\cdots(2n-r+2)}\,(\epsilon_{\alpha_1\alpha_2}\cdots\epsilon_{\alpha_{r-1}\alpha_r} + \cdots),$$


where the unwritten terms in the brackets ( ) are of the same type as the first term and are such
that the entire expression inside the brackets is skew-symmetrical in the indices $\alpha_1, \ldots, \alpha_r$;
thus there are (r - 1)(r - 3) . . . 3.1 terms in all. Hence


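$$\begin{aligned}
\Omega_r &= A_{\alpha_1\ldots\alpha_r}\,\delta x^{\alpha_1}\cdots\delta x^{\alpha_r}\\
&= \frac{(-1)^{r/2}\,1\cdot3\cdots(r-1)\,A}{2n(2n-2)\cdots(2n-r+2)}\,(\epsilon_{\alpha_1\alpha_2}\delta x^{\alpha_1}\delta x^{\alpha_2})\cdots(\epsilon_{\alpha_{r-1}\alpha_r}\delta x^{\alpha_{r-1}}\delta x^{\alpha_r})\\
&= (-1)^s\,\frac{1\cdot3\cdots(2s-1)\,A}{n(n-1)\cdots(n-s+1)}\Big(\sum\delta p_{i_1}\delta q_{i_1}\Big)\cdots\Big(\sum\delta p_{i_s}\delta q_{i_s}\Big),
\end{aligned}$$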


where we set r = 2s. The second half of Theorem 5 is also proved. 
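
The count (r - 1)(r - 3) . . . 3.1 quoted above is the number of ways of splitting r indices into unordered pairs, one pairing for each product of r/2 epsilon symbols. The following is a minimal sketch that checks this count by brute force; the helper names `pairings` and `double_factorial_odd` are illustrative and do not come from the paper.

```python
from math import prod


def pairings(indices):
    """Yield every way of splitting the list `indices` into unordered pairs."""
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def double_factorial_odd(r):
    """Return (r-1)(r-3)...3.1 for an even order r."""
    return prod(range(r - 1, 0, -2))


for r in (2, 4, 6, 8):
    count = sum(1 for _ in pairings(list(range(1, r + 1))))
    print(r, count, double_factorial_odd(r))
```

For r = 4 the three pairings correspond to the three terms of (7.2).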

9. Relative Integral Invariants.—If $\oint\Omega_r$ is a relative integral invariant of order r, then
$\int\Omega_r'$ is an absolute integral invariant of order r + 1, and conversely,* where $\Omega_r'$ is the exterior
derivative of $\Omega_r$. Hence, when universality is understood, we have by § 8

$$\Omega_r' = 0 \quad\text{for even } r, \tag{9.1}$$

$$\Omega_r' = C\sum\cdots\sum\delta p_{i_1}\delta q_{i_1}\cdots\delta p_{i_s}\delta q_{i_s} \quad\text{for odd } r = 2s-1, \tag{9.2}$$

where C is a constant. 

If (9.1) is the case, then† $\Omega_r$ itself must be the exterior derivative of another exterior
form $\Lambda$ (of degree r - 1): $\Omega_r = \Lambda'$, so that

$$\oint\Omega_r = \int\Omega_r' \ \text{(by Stokes’s theorem)} = \int\Lambda'' = 0 \ \text{(since } \Lambda'' = 0 \text{ by Poincaré’s theorem)},$$


and we have 

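Theorem 6. There is no universal relative integral invariant of even order.

If (9.2) is the case, we may write it in the form

$$\Big(\Omega_r - C\sum\cdots\sum p_{i_1}\delta q_{i_1}\,\delta p_{i_2}\delta q_{i_2}\cdots\delta p_{i_s}\delta q_{i_s}\Big)' = 0,$$

whence, by the same reason, the expression inside brackets must be of the form $\Lambda'$, so that

$$\Omega_r = C\sum\cdots\sum p_{i_1}\delta q_{i_1}\,\delta p_{i_2}\delta q_{i_2}\cdots\delta p_{i_s}\delta q_{i_s} + \Lambda',$$

therefore

$$\oint\Omega_r = C\oint\sum\cdots\sum p_{i_1}\delta q_{i_1}\,\delta p_{i_2}\delta q_{i_2}\cdots\delta p_{i_s}\delta q_{i_s} \quad\Big(\text{since } \oint\Lambda' = \int\Lambda'' = 0\Big),$$

and we have

Theorem 7. Apart from an arbitrary constant factor, there is only one universal relative
integral invariant of odd order 2s - 1, namely

$$\oint\sum\cdots\sum p_{i_1}\delta q_{i_1}\,\delta p_{i_2}\delta q_{i_2}\cdots\delta p_{i_s}\delta q_{i_s} \qquad (s = 1, \ldots, n).$$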

10. Systems which possess a given Integral Invariant.‡ We now study the converse
problem suggested by the preceding results, namely that of determining all the systems of 
differential equations which possess one of the integral invariants obtained above. Let 

$$\frac{dq_i}{dt} = Q_i, \qquad \frac{dp_i}{dt} = P_i$$

be a system of 2n differential equations where the Q’s and P’s are unknown functions of the
q’s, p’s and t. We may always write such a system in the form §

$$\frac{dx^\alpha}{dt} = \epsilon^{\alpha\beta}X_\beta, \tag{10.1}$$


* Goursat, l.c., p. 212. 

† E. Cartan, Leçons sur les invariants intégraux, p. 73. Also Goursat, l.c., p. io<.

‡ Compare Whittaker, l.c., § 116.

§ Where $X_i = -P_i$, $X_{n+i} = Q_i$.




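the X’s being unknown functions of the x’s and t. The variational system of (10.1) is

$$\frac{d(\delta x^\alpha)}{dt} = \epsilon^{\alpha\beta}\,\frac{\partial X_\beta}{\partial x^\gamma}\,\delta x^\gamma. \tag{10.2}$$

We find in consequence of (10.2)

$$\frac{d}{dt}(\epsilon_{\alpha\beta}\,\delta x^\alpha\delta x^\beta) = Y_{\alpha\beta}\,\delta x^\alpha\delta x^\beta, \tag{10.3}$$

where

$$Y_{\alpha\beta} = \frac{\partial X_\alpha}{\partial x^\beta} - \frac{\partial X_\beta}{\partial x^\alpha},$$

which is skew-symmetrical in $\alpha$ and $\beta$. If then $\int\epsilon_{\alpha\beta}\,\delta x^\alpha\delta x^\beta$ is an absolute integral invariant
of the system (10.1), the left-hand side of (10.3) vanishes, so that $Y_{\alpha\beta} = 0$ or

$$\frac{\partial X_\alpha}{\partial x^\beta} - \frac{\partial X_\beta}{\partial x^\alpha} = 0, \tag{10.4}$$

which implies the existence of a function H of the x’s and t such that

$$X_\alpha = \frac{\partial H}{\partial x^\alpha}, \tag{10.5}$$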


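and therefore (10.1) is a Hamiltonian system. We have thus given a short proof of the known
theorem that a system possessing the absolute integral invariant $\int\sum\delta p_i\,\delta q_i$ $(= \tfrac12\int\epsilon_{\alpha\beta}\,\delta x^\alpha\delta x^\beta)$ is
necessarily a Hamiltonian system. We wish to prove more, namely

Theorem 8. Every system which possesses one of the integrals

$$\int\sum\cdots\sum\delta p_{i_1}\delta q_{i_1}\cdots\delta p_{i_s}\delta q_{i_s} \qquad (s = 1, \ldots, n)$$

as absolute integral invariant is a Hamiltonian system.

Since by (5.3)

$$\sum\cdots\sum\delta p_{i_1}\delta q_{i_1}\cdots\delta p_{i_s}\delta q_{i_s} = 2^{-s}\,(\epsilon_{\alpha_1\beta_1}\delta x^{\alpha_1}\delta x^{\beta_1})\cdots(\epsilon_{\alpha_s\beta_s}\delta x^{\alpha_s}\delta x^{\beta_s}),$$

we have

$$\frac{d}{dt}\{(\epsilon_{\alpha_1\beta_1}\delta x^{\alpha_1}\delta x^{\beta_1})\cdots(\epsilon_{\alpha_s\beta_s}\delta x^{\alpha_s}\delta x^{\beta_s})\} = 0,$$

or, by (10.3),

$$(Y_{\alpha_1\beta_1}\delta x^{\alpha_1}\delta x^{\beta_1})\cdots(\epsilon_{\alpha_s\beta_s}\delta x^{\alpha_s}\delta x^{\beta_s}) + \cdots + (\epsilon_{\alpha_1\beta_1}\delta x^{\alpha_1}\delta x^{\beta_1})\cdots(Y_{\alpha_s\beta_s}\delta x^{\alpha_s}\delta x^{\beta_s}) = 0.$$

The s terms on the left of this equation are equal to one another, so that we have simply

$$(Y_{\alpha_1\beta_1}\delta x^{\alpha_1}\delta x^{\beta_1})(\epsilon_{\alpha_2\beta_2}\delta x^{\alpha_2}\delta x^{\beta_2})\cdots(\epsilon_{\alpha_s\beta_s}\delta x^{\alpha_s}\delta x^{\beta_s}) = 0,$$

or, removing the arbitrary skew-symmetrical quantity $\delta x^{\alpha_1}\delta x^{\beta_1}\cdots\delta x^{\alpha_s}\delta x^{\beta_s}$,

$$Y_{\alpha_1\beta_1}\epsilon_{\alpha_2\beta_2}\cdots\epsilon_{\alpha_s\beta_s} + \cdots = 0,$$

where the unwritten terms of the sum on the left are of the same type as the first term and are
such that the sum as a whole is a skew-symmetrical quantity. Transvecting the above equation
by $\epsilon^{\alpha_2\beta_2}\cdots\epsilon^{\alpha_s\beta_s}$ the result may be written $Y_{\alpha_1\beta_1} = 0$, whence we have again (10.4) and
therefore (10.5). Theorem 8 is proved.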
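
The criterion (10.4) can be tried out numerically. The sketch below is an illustration under stated assumptions, not part of the paper: it takes n = 1 with x = (q, p), uses the identification $X_1 = -P$, $X_2 = Q$ from the footnote to (10.1), and estimates $Y_{\alpha\beta}$ by central differences. The damped oscillator is a hypothetical test system chosen only because it is not Hamiltonian when the damping is non-zero.

```python
def curl_matrix(X, x, h=1e-6):
    """Central-difference estimate of Y[a][b] = dX_a/dx^b - dX_b/dx^a at x."""
    n = len(x)

    def partial(a, b):
        xp, xm = list(x), list(x)
        xp[b] += h
        xm[b] -= h
        return (X(xp)[a] - X(xm)[a]) / (2 * h)

    return [[partial(a, b) - partial(b, a) for b in range(n)] for a in range(n)]


def oscillator(gamma):
    """X for dq/dt = p, dp/dt = -q - gamma*p (gamma = 0 is Hamiltonian)."""
    def X(x):
        q, p = x
        Q = p                # dq/dt
        P = -q - gamma * p   # dp/dt
        return [-P, Q]       # X_1 = -P, X_2 = Q
    return X


def max_abs(Y):
    return max(abs(v) for row in Y for v in row)


undamped = max_abs(curl_matrix(oscillator(0.0), [0.3, -0.7]))
damped = max_abs(curl_matrix(oscillator(0.5), [0.3, -0.7]))
print(undamped, damped)
```

With no damping Y vanishes to rounding error, in agreement with (10.4); with damping 0.5 the entry $Y_{12}$ equals the damping constant, so no function H can exist.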

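Theorem 9. Every system which possesses one of the integrals

$$\oint\sum\cdots\sum p_{i_1}\delta q_{i_1}\,\delta p_{i_2}\delta q_{i_2}\cdots\delta p_{i_s}\delta q_{i_s} \qquad (s = 1, \ldots, n)$$

as relative integral invariant is a Hamiltonian system.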

For, the sth integral in Theorem 9 equals the sth integral in Theorem 8 by Stokes’s theorem, 
so that if a system possesses the sth integral in Theorem 9 as relative integral invariant it also 
possesses the sth integral in Theorem 8 as absolute integral invariant, whence the system is 
Hamiltonian by Theorem 8. Theorem 9 is proved. 





11. Application to the Theory of Canonical Transformations.—Making use of the above
results, we can determine the canonical transformations by a very simple argument. By
definition a transformation from the variables $q_1, \ldots, q_n, p_1, \ldots, p_n$ to the new variables
$\bar q_1, \ldots, \bar q_n, \bar p_1, \ldots, \bar p_n$ is called canonical if it changes every Hamiltonian system (1.1) again
into a Hamiltonian system

$$\frac{d\bar q_i}{dt} = \frac{\partial\bar H}{\partial\bar p_i}, \qquad \frac{d\bar p_i}{dt} = -\frac{\partial\bar H}{\partial\bar q_i} \qquad (i = 1, \ldots, n). \tag{11.1}$$

Then, since $\int\sum\delta p_i\,\delta q_i$ and $\int\sum\delta\bar p_i\,\delta\bar q_i$ are both universal absolute integral invariants of the
second order, they must coincide except for a constant factor (Theorem 2). Hence a canonical
transformation must satisfy the condition

$$\sum\delta\bar p_i\,\delta\bar q_i = c\sum\delta p_i\,\delta q_i, \tag{11.2}$$

where c is a constant.*

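This condition, being necessary, is also sufficient. To prove this we first write (11.2) in
the form

$$\epsilon_{\rho\sigma}\,\delta\bar x^\rho\delta\bar x^\sigma = c\,\epsilon_{\alpha\beta}\,\delta x^\alpha\delta x^\beta \quad\text{[see (5.3)]},$$

whence

$$\epsilon_{\rho\sigma}\,\frac{\partial\bar x^\rho}{\partial x^\alpha}\frac{\partial\bar x^\sigma}{\partial x^\beta} = c\,\epsilon_{\alpha\beta}. \tag{11.3}$$

An equivalent form of this is

$$\epsilon^{\alpha\beta}\,\frac{\partial\bar x^\rho}{\partial x^\alpha}\frac{\partial\bar x^\sigma}{\partial x^\beta} = c\,\epsilon^{\rho\sigma}. \tag{11.4}$$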
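
In matrix language (11.4) says that the Jacobian matrix M of the transformation satisfies $M\epsilon M^{T} = c\,\epsilon$. The following is a small sketch for linear transformations, under the assumptions that x = (q, p) and that $\epsilon$ is the block matrix with I above the diagonal and -I below it; the opposite sign convention leaves the test unchanged, since both sides change sign together. The function name `symplectic_factor` is illustrative only.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def symplectic_factor(M, tol=1e-12):
    """Return c if M eps M^T = c eps holds, otherwise None."""
    size = len(M)
    n = size // 2
    eps = [[0] * size for _ in range(size)]
    for i in range(n):
        eps[i][n + i] = 1      # assumed convention, see the lead-in
        eps[n + i][i] = -1
    S = matmul(matmul(M, eps), transpose(M))
    c = S[0][n]
    same = all(abs(S[r][s] - c * eps[r][s]) <= tol
               for r in range(size) for s in range(size))
    return c if same else None


scaling = [[2, 0, 0, 0], [0, 2, 0, 0], [0, 0, 3, 0], [0, 0, 0, 3]]   # q -> 2q, p -> 3p
swap = [[0, 0, 1, 0], [0, 0, 0, 1], [-1, 0, 0, 0], [0, -1, 0, 0]]    # q -> p, p -> -q
uneven = [[2, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]    # only q_1 rescaled

print(symplectic_factor(scaling), symplectic_factor(swap), symplectic_factor(uneven))
```

The scaling example satisfies the condition with c = 6, so it is canonical in the sense of (11.2) without being a contact transformation (c = 1); the last example satisfies it for no value of c.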


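The sufficiency proof is to show that if (11.2) is satisfied, (1.1) is transformed into (11.1).
We establish this by deducing (11.1) from (1.1) making use of (11.2) as follows:

$$\begin{aligned}
\frac{d\bar x^\rho}{dt} &= \frac{\partial\bar x^\rho}{\partial x^\alpha}\frac{dx^\alpha}{dt} + \frac{\partial\bar x^\rho}{\partial t}\\
&= \frac{\partial\bar x^\rho}{\partial x^\alpha}\,\epsilon^{\alpha\beta}\,\frac{\partial H}{\partial x^\beta} + \frac{\partial\bar x^\rho}{\partial t} \qquad\text{[by (1.1) condensed in the form (2.2)]}\\
&= c\,\epsilon^{\rho\sigma}\,\frac{\partial H}{\partial\bar x^\sigma} + \frac{\partial\bar x^\rho}{\partial t} \qquad\text{[by (11.2) in the equivalent form (11.4)]}\\
&= \epsilon^{\rho\sigma}\,\frac{\partial(cH)}{\partial\bar x^\sigma} + \frac{\partial\bar x^\rho}{\partial t} = \epsilon^{\rho\sigma}\,\frac{\partial\bar H}{\partial\bar x^\sigma},
\end{aligned}$$

which is a condensed form of (11.1), where

$$\bar H = cH + \phi \tag{11.5}$$

and $\phi$ is defined by

$$\epsilon^{\rho\sigma}\,\frac{\partial\phi}{\partial\bar x^\sigma} = \frac{\partial\bar x^\rho}{\partial t}$$

or

$$\frac{\partial\phi}{\partial x^\alpha} = \epsilon_{\rho\sigma}\,\frac{\partial\bar x^\rho}{\partial x^\alpha}\frac{\partial\bar x^\sigma}{\partial t}, \tag{11.6}$$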


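which is completely integrable since its condition of integrability is satisfied because of (11.3):

$$\frac{\partial}{\partial x^\beta}\Big(\frac{\partial\phi}{\partial x^\alpha}\Big) - \frac{\partial}{\partial x^\alpha}\Big(\frac{\partial\phi}{\partial x^\beta}\Big) = \frac{\partial}{\partial t}\Big(\epsilon_{\rho\sigma}\,\frac{\partial\bar x^\rho}{\partial x^\alpha}\frac{\partial\bar x^\sigma}{\partial x^\beta}\Big) = 0.$$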


The sufficiency proof is complete, f 


* A different method (without using integral invariants) of obtaining this condition, in the equivalent form
(11.3), and of deducing the equations (11.5) and (11.6), is indicated in H. C. Lee, “Sur les transformations des
congruences hamiltoniennes,” Comptes rendus (Paris), CCVI (1938), p. 1431. A generalisation of the method is
given in H. C. Lee, “On even-dimensional skew-metric spaces and their groups of transformations,” Amer. Journ.
of Math., lxvii (1945), p. 327.

† We have tacitly supposed for generality that the canonical transformation involves the time t. If in
particular the new variables $\bar x$ as functions of the old x’s are independent of t, (11.6) reduces to
$\partial\phi/\partial x^\alpha = 0$, whence $\phi$ is a constant (or a function of t). Since to a Hamiltonian function can be
added an arbitrary function of t without affecting the corresponding Hamiltonian system, the transformation law
(11.5) in this case may simply be taken in the form $\bar H = cH$.




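Now, (11.2) may be written $(\sum\bar p_i\,\delta\bar q_i - c\sum p_i\,\delta q_i)' = 0$, whence $\sum\bar p_i\,\delta\bar q_i - c\sum p_i\,\delta q_i$ is an exact
differential $\delta w$ of a function $w(q_1, \ldots, q_n, p_1, \ldots, p_n, t)$, t being unvaried under the symbol $\delta$.
Hence the condition (11.2) for canonical transformation is equivalent to

$$\sum\bar p_i\,\delta\bar q_i = c\sum p_i\,\delta q_i + \delta w.* \tag{11.7}$$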

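With the aid of w, we can actually integrate the system (11.6). For this purpose we first
write the condition (11.7) in the form

$$\bar g_\rho\,\delta\bar x^\rho - c\,g_\alpha\,\delta x^\alpha = \delta w,$$

where $g_i = p_i$, $g_{n+i} = 0$, and similarly $\bar g_i = \bar p_i$, $\bar g_{n+i} = 0$ (i = 1, . . ., n). We find

$$\frac{\partial g_\beta}{\partial x^\alpha} - \frac{\partial g_\alpha}{\partial x^\beta} = \epsilon_{\alpha\beta},$$

and similarly

$$\frac{\partial\bar g_\sigma}{\partial\bar x^\rho} - \frac{\partial\bar g_\rho}{\partial\bar x^\sigma} = \epsilon_{\rho\sigma}.$$

Then, the above condition gives

$$\frac{\partial w}{\partial x^\alpha} = \bar g_\rho\,\frac{\partial\bar x^\rho}{\partial x^\alpha} - c\,g_\alpha,$$

whence

$$\begin{aligned}
\frac{\partial^2 w}{\partial t\,\partial x^\alpha} &= \frac{\partial}{\partial t}\Big(\bar g_\rho\,\frac{\partial\bar x^\rho}{\partial x^\alpha}\Big) - 0 = \bar g_\rho\,\frac{\partial^2\bar x^\rho}{\partial t\,\partial x^\alpha} + \frac{\partial\bar g_\rho}{\partial\bar x^\sigma}\frac{\partial\bar x^\sigma}{\partial t}\frac{\partial\bar x^\rho}{\partial x^\alpha}\\
&= \bar g_\rho\,\frac{\partial^2\bar x^\rho}{\partial x^\alpha\,\partial t} + \Big(\frac{\partial\bar g_\rho}{\partial\bar x^\sigma} - \frac{\partial\bar g_\sigma}{\partial\bar x^\rho}\Big)\frac{\partial\bar x^\sigma}{\partial t}\frac{\partial\bar x^\rho}{\partial x^\alpha} + \frac{\partial\bar g_\sigma}{\partial\bar x^\rho}\frac{\partial\bar x^\sigma}{\partial t}\frac{\partial\bar x^\rho}{\partial x^\alpha}\\
&= \bar g_\rho\,\frac{\partial^2\bar x^\rho}{\partial x^\alpha\,\partial t} - \epsilon_{\rho\sigma}\,\frac{\partial\bar x^\rho}{\partial x^\alpha}\frac{\partial\bar x^\sigma}{\partial t} + \frac{\partial\bar g_\sigma}{\partial x^\alpha}\frac{\partial\bar x^\sigma}{\partial t},
\end{aligned}$$

so that

$$\begin{aligned}
\epsilon_{\rho\sigma}\,\frac{\partial\bar x^\rho}{\partial x^\alpha}\frac{\partial\bar x^\sigma}{\partial t} &= \bar g_\rho\,\frac{\partial^2\bar x^\rho}{\partial x^\alpha\,\partial t} + \frac{\partial\bar g_\rho}{\partial x^\alpha}\frac{\partial\bar x^\rho}{\partial t} - \frac{\partial^2 w}{\partial t\,\partial x^\alpha}\\
&= \frac{\partial}{\partial x^\alpha}\Big(\bar g_\rho\,\frac{\partial\bar x^\rho}{\partial t}\Big) - \frac{\partial^2 w}{\partial x^\alpha\,\partial t} = \frac{\partial}{\partial x^\alpha}\Big(\bar g_\rho\,\frac{\partial\bar x^\rho}{\partial t} - \frac{\partial w}{\partial t}\Big),
\end{aligned}$$

and therefore (11.6) is integrated into

$$\phi = \bar g_\rho\,\frac{\partial\bar x^\rho}{\partial t} - \frac{\partial w}{\partial t} + \text{arbitrary function of } t.$$

Hence (11.5) may be written in the form

$$\bar H = cH - \frac{\partial w}{\partial t} + \sum_i\bar p_i\,\frac{\partial\bar q_i}{\partial t}, \tag{11.8}$$

dropping an additive function of t.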

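The above results may be stated in

Theorem 10. A necessary and sufficient condition for canonical transformation is (11.2),
where c is a constant, which condition is equivalent to (11.7), where w is a function of
$(q_1, \ldots, q_n, p_1, \ldots, p_n, t)$. The corresponding change of Hamiltonian function is given by (11.8).

Note that by (2.3) the left-hand side of (11.4) may be written

$$\sum_i\Big(\frac{\partial\bar x^\rho}{\partial q_i}\frac{\partial\bar x^\sigma}{\partial p_i} - \frac{\partial\bar x^\rho}{\partial p_i}\frac{\partial\bar x^\sigma}{\partial q_i}\Big),$$

which is the well-known Poisson’s bracket-expression $(\bar x^\rho, \bar x^\sigma)$. Hence the condition (11.4) for
canonical transformation decomposes into

$$(\bar q_i, \bar q_j) = 0, \quad (\bar p_i, \bar p_j) = 0, \quad (\bar q_i, \bar p_j) = \begin{cases} 0 & \text{when } i \neq j,\\ c & \text{when } i = j. \end{cases} \tag{11.9}$$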


* Many authors, using other methods, come to this conclusion with c = 1 only. See notably Encykl. der
Math. Wiss., III, 3, p. 454; Handbuch der Physik, Band V, p. 99; Forsyth, Theory of Differential Equations,
part IV, vol. V, p. 399. Although the distinction between c = 1 and c ≠ 1 is slight and is not very important in
application, it is a difference in theory.




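Note also that another equivalent form of (11.3) is

$$\epsilon_{\rho\sigma} = c\,\epsilon_{\alpha\beta}\,\frac{\partial x^\alpha}{\partial\bar x^\rho}\frac{\partial x^\beta}{\partial\bar x^\sigma}, \tag{11.10}$$

of which the right-hand side may be written in the form $-c[\bar x^\rho, \bar x^\sigma]$, where [u, v] denotes the
well-known Lagrange’s bracket-expression

$$[u, v] = \sum_i\Big(\frac{\partial q_i}{\partial u}\frac{\partial p_i}{\partial v} - \frac{\partial p_i}{\partial u}\frac{\partial q_i}{\partial v}\Big).$$

Hence the condition (11.10) for canonical transformation decomposes into

$$[\bar q_i, \bar q_j] = 0, \quad [\bar p_i, \bar p_j] = 0, \quad [\bar q_i, \bar p_j] = \begin{cases} 0 & \text{when } i \neq j,\\ 1/c & \text{when } i = j. \end{cases} \tag{11.11}$$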

It is seen that if the constant c is unity, a canonical transformation is a contact
transformation.* We emphasise this fact by saying that the group of canonical transformations†
contains the group of contact transformations as a proper subgroup.

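We can define a canonical transformation in a specific manner‡ by the equations

$$\Omega_s(\bar q_1, \ldots, \bar q_n, q_1, \ldots, q_n, t) = 0 \qquad (s = 1, \ldots, m;\ 0 \le m \le n),$$

$$\bar p_i = \frac{\partial W}{\partial\bar q_i} + \sum_{s=1}^{m}\lambda_s\,\frac{\partial\Omega_s}{\partial\bar q_i}, \qquad c\,p_i = -\frac{\partial W}{\partial q_i} - \sum_{s=1}^{m}\lambda_s\,\frac{\partial\Omega_s}{\partial q_i} \qquad (i = 1, \ldots, n), \tag{11.12}$$

where W is a function of $(\bar q_1, \ldots, \bar q_n, q_1, \ldots, q_n, t)$ such that

$$W(\bar q_1, \ldots, \bar q_n, q_1, \ldots, q_n, t) = w(q_1, \ldots, q_n, p_1, \ldots, p_n, t).$$

We find

$$\begin{aligned}
\frac{\partial w}{\partial t} &= \frac{\partial W}{\partial t} + \sum_i\frac{\partial W}{\partial\bar q_i}\frac{\partial\bar q_i}{\partial t}\\
&= \frac{\partial W}{\partial t} + \sum_i\Big(\bar p_i - \sum_s\lambda_s\,\frac{\partial\Omega_s}{\partial\bar q_i}\Big)\frac{\partial\bar q_i}{\partial t}\\
&= \frac{\partial W}{\partial t} + \sum_i\bar p_i\,\frac{\partial\bar q_i}{\partial t} + \sum_s\lambda_s\,\frac{\partial\Omega_s}{\partial t},
\end{aligned}$$

so that (11.8) becomes in this case

$$\bar H = cH - \frac{\partial W}{\partial t} - \sum_{s=1}^{m}\lambda_s\,\frac{\partial\Omega_s}{\partial t}. \tag{11.13}$$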

In particular, if c = 1, (11.13) is known as the law of change of Hamiltonian function § corresponding
to the contact transformation (11.12). However, the form (11.13) of the law is less
significant than the general form (11.8), because the multipliers $\lambda_1, \ldots, \lambda_m$ are
quantities irrelevant to the canonical transformation, being merely parasites of the process of
analysis.

see* 4 ^^tef/“ Pp a 2 C 9 “ Wormati011 “ terms 0f La ^e J s and Poisson’s bracket-expressions, 

(l 1 “ rtS (11 t i r o a ) nsformations folms a S rou P ca * be seen from any one of the conditions 

‡ Compare, for the special case of a contact transformation, Whittaker, l.c., p. 20?

§ Whittaker, l.c., p. 309.

Academia Sinica. 


(Issued separately May 8, 1947) 





XXVII.—Expansions of Lamé Functions into Series of Legendre Functions.*
By A. Erdélyi, Mathematical Institute, The University, Edinburgh
(MS. received October 5, 1945. Read January 14, 1946) 


Introduction 


1. The object of this paper is to investigate expansions of Lamé functions into series of
associated Legendre functions. Representations of Lamé polynomials by (terminating) series
of associated Legendre polynomials are not entirely new. In fact they are almost as old as the
theory of Lamé polynomials itself. In a letter to Borchardt, published in Crelle’s Journal,
Heine (1859 a) points out that the characteristic feature of his own approach (as compared
to Lamé’s original treatment) is the representation of Lamé’s polynomials by series of
Legendre functions (Heine, 1859 b) instead of series of powers (as in Lamé’s investigations).
In Heine’s investigations the series of Legendre functions appear quite naturally from some
considerations on ellipsoidal surface harmonics and their connection with spherical surface
harmonics. The main theoretical interest, in Heine’s opinion, of his series comes from
their coefficients being deducible from the coefficients of certain orthogonal substitutions.

It seems that Heine’s developments, like so many other valuable parts of his Handbuch
der Kugelfunktionen, have been overlooked by subsequent workers in the field. Darwin
(1901) rediscovered, apparently independently of Heine, and also used numerically, the
expansions of Lamé polynomials into series of associated Legendre polynomials, but his work
too was to share the fate of Heine’s. In the modern presentations of the theory of Lamé
polynomials known to me, one finds only the power-series. Humbert (1926) only mentions
Darwin’s work, which he thinks to be too involved, and Strutt (1932) remarks that Darwin
gives a different representation of Lamé polynomials, without going into further details.

When I started on the investigations contained in this paper I was not aware of Heine’s
and Darwin’s developments. This proved to be a fortunate circumstance, for so I chose
quite a different approach providing a wider basis from which one could get not only the series
dealt with by Heine and Darwin, but actually all the different types of series of associated
Legendre functions representing (polynomial or transcendental) Lamé functions. My starting-point
was the integral equation satisfied by Lamé functions, in connection with Ince’s developments
of Lamé functions into what he termed Fourier-Jacobi expansions (Ince, 1940 b).

Fourier-Jacobi expansions are not quite new in the theory of Lamé polynomials either,
though I did not know this at that time. Heine (1878, pp. 372 and 377) has two different
types of them, one (cf. also Heine, 1859 b, p. 95) closely related to his expansions into Legendre
functions. Darwin (1901) too used these expansions. Hobson (1892) even discussed such
expansions in connection with certain transcendental Lamé functions. However, all these
beginnings have been more or less neglected by more recent authors. It is, in my opinion,
an important step that Ince rediscovered these series and, going beyond his predecessors,
realised the fundamental importance of the Fourier-Jacobi series in the theory of Lamé
functions, in particular of transcendental Lamé functions, which he was the first to discuss as
of equal standing with Lamé polynomials. It is sufficient to recall the analogous developments
in the case of Mathieu’s equation in order to see that the importance of this new type
of expansion can hardly be overestimated.

Mathieu’s equation, which may be written in the form 


d*y 

— + sin 2 ^=o, 


(i-i) 


* This paper was written in August 1940 and is published here without any alteration. Since then I have
investigated similar expansions of the general solution of Lamé’s equation, and also corresponding expansions
of solutions of the Heun equation.




has been studied by earlier writers (Lindemann, 1883; Schubert, 1886) by expanding the 
solutions into series of powers of sin u or cos u. Though these researches gave valuable 
results on the solution of (1.1) with arbitrary values of they failed to give sufficient information
about the periodic solutions. It is not too much to say that the theory of the periodic
solutions of (1.1) is entirely based on the representation by Fourier series of these solutions,
and hence it is only natural to expect that the corresponding Fourier-Jacobi series will prove
equally successful in connection with Lamé’s equation.

Also, the introduction of the Fourier-Jacobi series to the theory of Lamé functions has the
advantage that it exhibits the close relation between the theory of Lamé functions on one hand
and the theory of Mathieu functions on the other hand. It is hoped that it will be possible
to develop a theory of Lamé functions analogous to the well-known theory of Mathieu
functions. As a first example, Dr Ince himself studied relations between Lamé functions
(Ince, 1940 b, § 6) which are analogous to, and generalisations of, certain relations between
Mathieu functions. It seems well worth trying to develop the theory of Lamé functions
so far that all important theorems on Mathieu functions shall appear as limiting cases of
corresponding theorems on Lamé functions.

2. It is well known (Heine, 1878, § 106) that, beside the Fourier expansion of Mathieu 
functions, there are expansions in terms of Bessel functions—Neumann series—which are of 
great importance when dealing with functions of imaginary variable or with Mathieu functions 
of the second kind. These expansions have essentially the same coefficients as the Fourier 
expansions of the functions concerned, thus showing that the two types of expansions are 
intimately connected. This connection may be exhibited by the integral equations satisfied 
by Mathieu functions which convert the Fourier series into the corresponding Neumann 
series. 

In a discussion following a lecture on Lamé functions by the late Dr Ince to the Edinburgh
Mathematical Society, I pointed out that a similar procedure should be possible with Lamé
functions, and is likely to give new series convergent in domains where the power-series or
the Fourier-Jacobi series are not convergent. Dr Ince added that it should be expected that
the new series would be in terms of Legendre functions. From what follows, the correctness
of both conjectures will be seen.

The procedure I was thinking of at that time is as follows. Let us take Lamé’s equation
in the form

$$L[y] \equiv \frac{d^2y}{du^2} + \{h - n(n+1)k^2\,\mathrm{sn}^2(u, k)\}\,y = 0, \tag{2.1}$$

where $k^2$ is the modulus of the elliptic functions; 4K and 2iK' are the primitive periods of
sn(u, k) = sn u. For certain values of h, (2.1) has solutions which are periodic mod. 2K or
4K, and which will be called Lamé functions of first kind or Lamé functions simply. An even
Lamé function of period 2K, say,

$$y = \mathrm{dn}\,u\sum_{r=0}^{\infty}C_{2r}\cos(2r\,\mathrm{am}\,u) \tag{2.2}$$

[Ince, 1940 b (3.3)] satisfies the integral equation (Whittaker and Watson, 1927, § 23.6; Ince,
1940 a, § 10)

$$y(u) = \lambda\int_{-2K}^{2K}P_n(k\,\mathrm{sn}\,u\,\mathrm{sn}\,v)\,y(v)\,dv, \tag{2.3}$$

where $P_n$ denotes the Legendre function of degree n. Substituting (2.2) on the right-hand
side of (2.3), we have formally

$$y(u) = \lambda\sum_{r=0}^{\infty}C_{2r}\int_{-2K}^{2K}P_n(k\,\mathrm{sn}\,u\,\mathrm{sn}\,v)\cos(2r\,\mathrm{am}\,v)\,\mathrm{dn}\,v\,dv. \tag{2.4}$$

The integral appearing here can easily be shown to be a constant multiple of $P_n^{2r}(\mathrm{dn}\,u)$.
Thus expansions of Lamé functions into series of associated Legendre functions have been
obtained. In the main part of the paper, however, another approach will be preferred.




It will be seen later that any of the six quantities sn u, cn u, dn u, k sn u, ik cn u/k',
dn u/k' may be taken as the variable of the Legendre functions and consequently there are
six different types of expansions. Also it will be seen that these series converge for values
of u for which the power-series and Fourier-Jacobi series are divergent, and hence the new
series are a welcome means to investigate the analytic continuation of periodic solutions of
Lamé’s equation.

These series have another feature which renders them useful even in connection with Lamé
polynomials where the question of convergence and analytic continuation does not arise at
all. It will be seen that the series in associated Legendre functions remain still solutions of
Lamé’s equation when $P_n^m$ is replaced by $Q_n^m$ or by any linear combination of the two.
Hence we have a second solution of Lamé’s equation the asymptotic behaviour of which for
large values of the variable is readily obtained.

In the following pages only the general theory of the solutions is given, and no attempt 
has been made to utilise them for numerical work. Also I restricted myself for the sake of 
brevity to solutions of real periods 2K and 4K, though solutions of other real periods are likely 
to be accessible to the same method. 

There are similar developments in connection with the Fuchsian equation of second order 
with four singularities—the so-called Heun equation. 


Preliminaries 

3. For the sake of brevity we shall write

$$s = \mathrm{sn}\,u, \quad c = \mathrm{cn}\,u, \quad d = \mathrm{dn}\,u. \tag{3.1}$$

All elliptic functions, unless the contrary is stated explicitly, have the modulus $k^2$. When
Lamé functions appear in connection with potential problems (Whittaker and Watson, 1927,
§ 23.5), k is real and between 0 and 1. In this case we may put $k = \sin\theta$, where the acute
angle $\theta$ is the modular angle. The complementary modulus, $k' = \sqrt{1 - k^2}$, is equal to $\cos\theta$
in this case.

There are the well-known formulae (Whittaker and Watson, 1927, §§ 22.11 and 22.12)

$$s' = cd, \quad c' = -sd, \quad d' = -k^2sc, \tag{3.2}$$

where, for instance, $s' = ds/du$. There is also the relation (Whittaker and Watson, 1927, § 22.11)

$$d^2 = 1 - k^2s^2 = k'^2 + k^2c^2 = c^2 + k'^2s^2. \tag{3.3}$$

In the following sections z will denote any of the six quantities s, c, d, ks, ikc/k', d/k'.
In these cases $(1 - z^2)^{1/2}$ shall be determined uniquely by the relations

$$\begin{aligned}
&(1 - s^2)^{1/2} = c, \quad (1 - c^2)^{1/2} = s, \quad (1 - d^2)^{1/2} = ks, \quad (1 - k^2s^2)^{1/2} = d,\\
&(1 + k^2c^2/k'^2)^{1/2} = d/k', \quad (1 - d^2/k'^2)^{1/2} = ikc/k',
\end{aligned} \tag{3.4}$$

which are in agreement with (3.3).
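
The relations (3.2) and (3.3) lend themselves to a quick numerical check. The sketch below is an illustration and not the author's method: it integrates the system (3.2) from s(0) = 0, c(0) = d(0) = 1 with a classical Runge-Kutta scheme and compares the three expressions for $d^2$ in (3.3). The function name `jacobi_sn_cn_dn`, the step count and the sample values of k and u are arbitrary choices.

```python
import math


def jacobi_sn_cn_dn(u, k, steps=2000):
    """Integrate s' = c d, c' = -s d, d' = -k^2 s c from u = 0 with RK4."""
    def rhs(y):
        s, c, d = y
        return (c * d, -s * d, -k * k * s * c)

    y = (0.0, 1.0, 1.0)
    h = u / steps
    for _ in range(steps):
        a1 = rhs(y)
        a2 = rhs(tuple(v + 0.5 * h * a for v, a in zip(y, a1)))
        a3 = rhs(tuple(v + 0.5 * h * a for v, a in zip(y, a2)))
        a4 = rhs(tuple(v + h * a for v, a in zip(y, a3)))
        y = tuple(v + h * (p + 2 * q + 2 * r + w) / 6
                  for v, p, q, r, w in zip(y, a1, a2, a3, a4))
    return y


k = 0.6
kp2 = 1 - k * k                       # square of the complementary modulus k'
s, c, d = jacobi_sn_cn_dn(1.3, k)
forms_of_d2 = (d * d, 1 - k * k * s * s, kp2 + k * k * c * c, c * c + kp2 * s * s)
print(forms_of_d2)
```

For k = 0 the same routine reduces to s = sin u, c = cos u, d = 1, which gives an independent check of the integrator.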

4. Ferrers’ definition of associated Legendre functions,

$$P_n^m(z) = (1 - z^2)^{m/2}\,\frac{d^mP_n(z)}{dz^m}, \qquad Q_n^m(z) = (1 - z^2)^{m/2}\,\frac{d^mQ_n(z)}{dz^m}, \tag{4.1}$$

will be adopted; m shall be zero or a positive integer, while n is unrestricted. This is the
definition adopted by Whittaker and Watson (1927) for real values of z between -1 and +1.
Having defined $(1 - z^2)^{1/2}$ uniquely by (3.4), and m being an integer, we may use this definition
of the associated Legendre functions without any ambiguity for all values of z. Hobson
(1931) in his standard work uses a different definition. For -1 < z < 1 and integer m his
definition differs from (4.1) in a factor $(-)^m$.




We have (Whittaker and Watson, 1927, § 15.51)

$$P_n^{m\prime} = \frac{dP_n^m}{dz} = (1 - z^2)^{-1/2}P_n^{m+1} - \frac{mz}{1 - z^2}\,P_n^m, \tag{4.2}$$

denoting by dashes differentiation with respect to z and omitting the variable from the symbol
of $P_n^m$. Also we have (Hobson, 1931, p. 291; the different sign is a consequence of the
different definition of $P_n^m$)


$$2mz(1 - z^2)^{-1/2}P_n^m = P_n^{m+1} + (n - m + 1)(n + m)P_n^{m-1}.$$
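
The two relations (4.2) and (4.3) can be checked numerically for integer n directly from the definition (4.1). The following is a minimal sketch under that restriction (n a non-negative integer, -1 < z < 1); the helper names `legendre_coeffs` and `ferrers_P` are illustrative, and the polynomial coefficients are generated by the standard three-term recurrence for Legendre polynomials.

```python
def legendre_coeffs(n):
    """Coefficients (lowest degree first) of the Legendre polynomial P_n."""
    prev, cur = [1.0], [0.0, 1.0]
    if n == 0:
        return prev
    for j in range(1, n):
        # (j + 1) P_{j+1} = (2j + 1) z P_j - j P_{j-1}
        nxt = [0.0] * (j + 2)
        for i, a in enumerate(cur):
            nxt[i + 1] += (2 * j + 1) * a / (j + 1)
        for i, a in enumerate(prev):
            nxt[i] -= j * a / (j + 1)
        prev, cur = cur, nxt
    return cur


def ferrers_P(n, m, z):
    """P_n^m(z) = (1 - z^2)^(m/2) d^m P_n / dz^m, as in (4.1), for integer n."""
    coeffs = legendre_coeffs(n)
    for _ in range(m):
        coeffs = [i * a for i, a in enumerate(coeffs)][1:]
    poly = sum(a * z ** i for i, a in enumerate(coeffs))
    return (1 - z * z) ** (m / 2) * poly


n, m, z = 5, 2, 0.3
lhs = 2 * m * z * (1 - z * z) ** -0.5 * ferrers_P(n, m, z)
rhs = ferrers_P(n, m + 1, z) + (n - m + 1) * (n + m) * ferrers_P(n, m - 1, z)

h = 1e-5
numeric_derivative = (ferrers_P(n, m, z + h) - ferrers_P(n, m, z - h)) / (2 * h)
formula_derivative = ((1 - z * z) ** -0.5 * ferrers_P(n, m + 1, z)
                      - m * z / (1 - z * z) * ferrers_P(n, m, z))
print(lhs - rhs, numeric_derivative - formula_derivative)
```

Both differences should vanish up to rounding and finite-difference error. With Hobson's definition, which carries the extra factor $(-)^m$, the corresponding relations have different signs, as remarked in the text.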
From (4.2) and (4.3) we obtain 

0Pf= 2(1 - S 2)-i{p»+i + ^ (w _ i)<i -^)-ip»} 2 p: 

1 z 


I — 


I+S 3 


'Up: 


(4.3) 


= z(i -+ i)P™ +1 + l(x-m + 1)(* + m)(m - i)P™ -1 }- 

I — 0* 

and applying once more (4.3) to (m + i)P™ +1 and (m - i)P™ -1 respectively, 

^=ipr a +i|«(«+i) U 2 } p » 

4 - l(n — m +1 ){n-m + 2 ){n + m~ 1 ){n 4- m)^~ 2 . 

Similarly 

= z( 1 - 4 '{m- 2 )(m ~ i)mz(i - z 2 )~i P™}--( m 2 + 2)niP™ 

- «(i - * 2 )"*{K^ +1)(« + 2)P? +1 

-410^ - 2){m - i)(« - m + i)(« 4 -ttz)P™~ 1 }-;(/?Z 2 + 2)mP® 

and hence 


(44) 


(4.5) 


(4.6) 


(4-7) 


3 ^Pf = li m + 2)P ^ +2 + n(n +1) - + 2 ) j 

+ J(» - 4-i)(* - w 4- 2)(« 4 - ^ -1)(^ + m){m - 2 )P^“ 2 . 

Also we have the differential equation

$$P_n^{m\prime\prime} = \frac{2z}{1 - z^2}\,P_n^{m\prime} - \Big\{\frac{n(n+1)}{1 - z^2} - \frac{m^2}{(1 - z^2)^2}\Big\}P_n^m.$$

(4.4), (4.5) and (4.6) are still valid for functions of negative order defined by

$$P_n^{-m} = \frac{\Gamma(n - m + 1)}{\Gamma(n + m + 1)}\,P_n^m.$$

All properties of $P_n^m$ mentioned hitherto are shared by the Legendre functions of second
kind, $Q_n^m$.
Further, we shall employ the addition theorem for Legendre functions (Whittaker and 
Watson, 1927, § 15.71) 

P„{^ + (i-^)i(i-^)i cos w} -)” ^ ( ( ”-~^^ TOP?(/) cos mo, (4.8) 

and the formula obtained by differentiating (4.8) with respect to co, 

(1 - z 2 )i(i - /®)4 sin o>P'{z*+(1 - z 2 )i(i - fi)l cos u>} 


(4-9) 




In order to discuss the convergence of our series we need the asymptotic representation 
of associated Legendre functions for large (positive integer) values of m . We have [Hobson, 
1931, p. 188 (11)] 

to =(- 

T(n-m +1) 

T(n + m + 1) 


-(-> 


-I F[ -n, n + i; m + i; 


V(n-m + i)T(m + i) \x+zj 
and hence for large positive integer m, 

Also [Hobson, 1931, p. 201 (26)], 




1 -zsh" 


(4.10) 


Q“(«)=(-) m; 


T(n +m + x)Y(m - n) 

T(m + 1) 


{G~ 


+ z\% m I 1 +z 

— j F l -n, n + i; m + i; — 

' 2 


*/ 




I lz£\ 


fan 


FI -n, n + i; m + i; 


x - 
2 




\i+V 

where the upper or lower sign has to be taken according as the imaginary part of z is positive 
or negative. Hence 




T (n + m + x)T(m - n) 




:+z) 




( 4 .II) 


It is worth while remarking that all the recurrence formulae used in this section are satisfied
by $P_n^m(-z)$ as well as by $P_n^m(z)$, and if n is not an integer, these two solutions are different
from each other. $Q_n^m(z)$ is a linear combination of the two, namely that linear combination
which behaves for large values of |z| like const. $z^{-n-1}$.


Recurrence Relations. Convergence of the Series 
5. It will be seen that Lamé’s differential equation has solutions of the form

$$y = \sum_{m=0}^{\infty}X_m\,\Gamma(n - m + 1)\,P_n^m(z) \tag{5.1}$$

and 

y =feST 2 ~ m+ *)*TO> C5.*) 

\ 1 * / rn=l 

where ac is some constant depending on ^ alone and the coefficients X w and X m depend on 
h ,, k and n . 

Substituting (5.1) in (2.1), we shall arrive at the relation

$$\sum_{m=0}^{\infty}X_m\,\Gamma(n - m + 1)\{P_n^{m+2}(z) + (a - bm^2)P_n^m(z) + (n - m + 1)(n - m + 2)(n + m - 1)(n + m)P_n^{m-2}(z)\} = 0, \tag{5.3}$$

which has to be satisfied identically in z. Here a and b are constants depending on h, k and
n, but independent of m and z. (5.3) may be written

$$\sum_{m=2}^{\infty}X_{m-2}\,\Gamma(n - m + 3)P_n^m + \sum_{m=0}^{\infty}X_m\,\Gamma(n - m + 1)(a - bm^2)P_n^m + \sum_{m=-2}^{\infty}X_{m+2}\,\Gamma(n - m + 1)(n + m + 1)(n + m + 2)P_n^m = 0.$$

In the first two terms (m = -2 and -1) of the last sum we use (4.7) and then write the last
equation in the form

OP 

2IX** — nz + 3 ){€ m _ 2 X m _2 ~ ~~ a m ( 2 ) ~ 

0 


( 5 * 4 ) 




Here 




€_ 2 = €_! = o, e 0 = 2 and e w = i when m > i; 
(n + m + i){n + m + 2) 


and 


$m-2 ~ 


hrfi-t 


in - m + x)(n -m + 2) 


i-m + i)(n - m + 2) 

b - a 

when m^x while p_ x - ■ +1, 


CS-5) 


! + l) 


In order that (5.4) shall hold identically in z, $\{X_m\}$ must satisfy the system of recurrence
relations

$$\epsilon_{m-2}X_{m-2} = \beta_{m-2}X_m + \alpha_mX_{m+2} \qquad (m = 0, 1, 2, \ldots). \tag{5.6}$$

Obviously (5.6) consists of two independent systems of recurrence relations, namely

$$0 = \beta_{-2}X_0 + \alpha_0X_2, \quad 2X_0 = \beta_0X_2 + \alpha_2X_4, \quad X_{2m-2} = \beta_{2m-2}X_{2m} + \alpha_{2m}X_{2m+2} \quad (m = 2, 3, 4, \ldots) \tag{5.7}$$

for the coefficients with even subscript, and

$$0 = \beta_{-1}X_1 + \alpha_1X_3, \quad X_{2m-1} = \beta_{2m-1}X_{2m+1} + \alpha_{2m+1}X_{2m+3} \quad (m = 1, 2, 3, \ldots) \tag{5.8}$$

for the coefficients with odd subscript.

Similarly, substituting (5.2) into (2.1), we shall arrive at a relation of the form 

00 

2 X^r(* - m + x){(» + 2)P“ +3 (2) + («' - 

+ in - m + x){n - m + 2)(n + m - j)(n + m){m - 2)P™ “ 2 (z)} = o, 
and this leads to the recurrence relations 

o^x'+c^x;, 0 «$x;+a 2 x; 

^m-2 ~ $m - a 7wX w + 2 (w = 3 , 4 , 5> • * •)? 

where 

V m 2 - a’ 

0 = 2, 3, 4, • • *)• 


^ 0m- S =7 


and 


A ™~ 2 {71- m + i){n - m + 2) 

(5.10) consists of two independent systems of recurrence relations, namely 

o =P'-iK+«iK 

+ l + a 2m+lX-2 m +a (tn = I, 2, 3, . 


o= J e;x;+a i! x' ) 

^2m - 2 P$m - “f" X 2n j -i- 2 


0 = 2, 3, 4, 


•), 

•), 


for coefficients -with even and odd subscripts respectively. 

6. From (5.6) the infinite continued fraction (Perron, 1913, § 57) 


(5-9) 

(S-io) 

(S-n) 

(5-12) 

(5-i3) 


$$\frac{X_m}{X_{m-2}} = \cfrac{1}{\beta_{m-2} + \cfrac{\alpha_m}{\beta_m + \cfrac{\alpha_{m+2}}{\beta_{m+2} + \cdots}}} \tag{6.1}$$

follows. Since $\lim\alpha_m = -1$ and $\lim\beta_m = b$ $(m \to \infty)$, this continued fraction is convergent,
at least for sufficiently large values of m (Perron, 1913, § 56, Satz 41), whenever the quadratic
equation $\rho^2 - b\rho + 1 = 0$ has two roots of different moduli; $\lim X_m/X_{m-2}$ $(m \to \infty)$ is equal to
that root of $\rho^2 - b\rho + 1 = 0$, the modulus of which is smaller. Hence the infinite continued
fraction (6.1) is convergent unless b is real and $-2 \le b \le 2$. Also

$$\lim_{m\to\infty}\frac{X_m}{X_{m-2}} = \tfrac12\{b \pm \sqrt{b^2 - 4}\}, \tag{6.2}$$

whichever has the smaller modulus. In particular, if b is real (and $b^2 > 4$), then in (6.2) the
sign opposite to that of b must be taken.
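
The selection of the smaller root in (6.2) can be illustrated by evaluating a continued fraction of the type (6.1) from the tail inwards. The sketch below is an illustration only: the coefficient sequences are passed in as functions, and the test case uses the limiting constant values $\alpha_m = -1$, $\beta_m = b$ rather than the actual coefficients (5.5). The names `smaller_root` and `ratio_from_continued_fraction` are hypothetical.

```python
import cmath


def smaller_root(b):
    """Root of rho^2 - b*rho + 1 = 0 with the smaller modulus."""
    disc = cmath.sqrt(b * b - 4)
    r1, r2 = (b + disc) / 2, (b - disc) / 2
    return r1 if abs(r1) < abs(r2) else r2


def ratio_from_continued_fraction(alpha, beta, m, depth=200):
    """Approximate X_m / X_{m-2} from a fraction of the form (6.1),
    truncated after `depth` levels and evaluated from the tail inwards."""
    tail = 0.0
    for j in range(depth, -1, -1):
        idx = m + 2 * j
        tail = alpha(idx) / (beta(idx) + tail)
    return 1 / (beta(m - 2) + tail)


for b in (3.0, -3.0, 1.0 + 2.0j):
    approx = ratio_from_continued_fraction(lambda i: -1.0, lambda i, b=b: b, m=10)
    print(b, approx, smaller_root(b))
```

For b = 3 and b = -3 the limits are $(3 - \sqrt5)/2$ and $(-3 + \sqrt5)/2$, that is, the sign opposite to that of b is taken, in agreement with the remark following (6.2).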




The system (5.7) is consistent and $\lim X_m = 0$ $(m \to \infty)$ only if the expression derived from
(6.1) for $X_2/X_0$ is equal to the value of this quotient as given by the first equation (5.7).
Hence the condition of consistency


o 5 ? 0,4 

^ A.+&+&+-°* 


( 6 - 3 ) 


which is a transcendental equation for h. 
Similarly from (5.8) the equation for h,


CC_i Og 05 

_1 ft + ft + ft + 


( 6 . 4 ) 


is readily derived. 

Similar considerations in connection with expansions of the type (5.2) show that the
infinite continued fractions arising from (5.10) are convergent whenever the two roots of the
quadratic equation $\rho'^2 - b'\rho' + 1 = 0$ have different moduli, i.e. whenever b' is either complex,
or real with $b'^2 > 4$. Also

$$\lim_{m\to\infty}\frac{X'_m}{X'_{m-2}} = \tfrac12\{b' \pm \sqrt{b'^2 - 4}\}, \tag{6.5}$$

whichever has the smaller modulus.

The system of recurrence formulae (5.12) is consistent, and $\lim X'_m = 0$ $(m \to \infty)$ if h is a
root of the transcendental equation


o' , 2l <*5 


= 0 , 


( 6 . 6 ) 


and the system of recurrence relations (5.13) has solutions different from zero if h satisfies 
the transcendental equation 


0/^2 ^4 ^6 


(6.7) 


All these conclusions hold when n is not an integer. When n is an integer, it may be
taken to be a positive integer. In this case there are polynomial solutions in which m ranges
from zero to n, and transcendental Lamé functions [Ince, 1940 a, § 2 (ii)] with which m ranges
from n + 1 to infinity. Though $P_n^m$ vanishes when m and n are both non-negative integers
and m > n, the product $\Gamma(n - m + 1)P_n^m$ tends to a finite limit when n tends to a positive
integer < m. For these transcendental Lamé functions (6.2) and (6.5) still hold for integer n,
though the equations of consistency are modified (see Ince, l.c.). Ince also discussed the
question of common roots of two equations of consistency (1940 b, § 5).

In the rest of the paper it will sometimes be assumed that n is not an integer. The results
for integer values of n then easily follow by a limiting process.

7. Turning to the question of the convergence of the series (5.1), it is seen from (4.10)
that for large m,

$$X_m\,\Gamma(n - m + 1)\,P_n^m(z) \sim (-)^mX_m\,\frac{\Gamma(n + m + 1)}{\Gamma(m + 1)}\Big(\frac{1 - z}{1 + z}\Big)^{m/2},$$


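and hence with m tending to infinity, the ratio of the moduli of the mth and (m - 2)th term of
(5.1) tends to

$$\Big|\frac{1 - z}{1 + z}\Big|\lim_{m\to\infty}\Big|\frac{X_m}{X_{m-2}}\Big| = \tfrac12\,\big|b \pm \sqrt{b^2 - 4}\big|\,\Big|\frac{1 - z}{1 + z}\Big|. \tag{7.1}$$

Hence the series is convergent when

$$\Big|\frac{1 - z}{1 + z}\Big| < 2\,\big|b \pm \sqrt{b^2 - 4}\big|^{-1} = \tfrac12\,\big|b \mp \sqrt{b^2 - 4}\big|. \tag{7.2}$$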


254 


A. Erdelyi 


In the last expression that sign must be chosen which makes $\tfrac12|b \mp \sqrt{b^2-4}|$ larger. This
larger value always exceeding unity, the domain of convergence in the $z$-plane will be the
finite part of the plane outside the circle $C_1$ with centre

$$z_1 = \frac{2^2 + |b \mp \sqrt{b^2-4}|^2}{2^2 - |b \mp \sqrt{b^2-4}|^2} \tag{7.3}$$

and radius

$$r_1 = \frac{4\,|b \mp \sqrt{b^2-4}|}{|b \mp \sqrt{b^2-4}|^2 - 2^2}. \tag{7.4}$$

Together with (5.1),

$$y = \sum_{m=0}^{\infty} X_m\Gamma(n-m+1)P_n^m(-z) \tag{7.5}$$

is a solution of Lamé's equation, distinct from (5.1) if $n$ is not an integer. This solution is
convergent outside the circle $C_2$ of the $z$-plane with centre $z_2 = -z_1$ and radius $r_2 = r_1$.
Another significant solution is

$$y = \sum_{m=0}^{\infty} X_m\Gamma(n-m+1)Q_n^m(z), \tag{7.6}$$

convergent in that part of the $z$-plane which is outside both $C_1$ and $C_2$.

The same results hold for (5.2) and the associated series, $b$ being replaced throughout
by $b'$.

It is worth while noting that all series are convergent on the entire imaginary axis of the
$z$-plane.
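The convergence region of § 7 can be checked numerically. The following Python sketch is an illustration only, not part of the original analysis: the function name `convergence_circle` and the sample modulus $k = 0.6$ are assumptions. It takes the region in the form $|(1-z)/(1+z)| < R$ with $R = \tfrac12|b \mp \sqrt{b^2-4}|$ (larger modulus), and uses the value $b = 2(2/k^2 - 1)$ that occurs later for series in $P_n^m(\mathrm{dn}\,u)$.

```python
import cmath
import math

def convergence_circle(b):
    """Return (R, centre, radius) of the circle bounding |(1-z)/(1+z)| < R.

    R is half the larger of |b + sqrt(b^2-4)| and |b - sqrt(b^2-4)|; the
    region is the exterior of the circle with the returned centre and radius.
    """
    root = cmath.sqrt(b * b - 4)
    R = max(abs(b + root), abs(b - root)) / 2
    if R <= 1 + 1e-12:
        # Both roots have modulus one: the convergence argument does not apply.
        raise ValueError("roots of the quadratic have equal moduli")
    centre = -(R * R + 1) / (R * R - 1)
    radius = 2 * R / (R * R - 1)
    return R, centre, radius

k = 0.6                          # sample modulus (assumption)
kp = math.sqrt(1 - k * k)        # complementary modulus k'
b = 2 * (2 / k**2 - 1)
R, centre, radius = convergence_circle(b)
print(R, centre, radius)
print((1 + kp) / (1 - kp), -(kp + 1 / kp) / 2, k * k / (2 * kp))
```

For real $0 < k < 1$ the two printed lines should agree: the bound reduces to $(1+k')/(1-k')$, the centre to $-\tfrac12(k' + 1/k')$ and the radius to $k^2/(2k')$. Every point of the imaginary axis gives $|(1-z)/(1+z)| = 1 < R$, in line with the remark above.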


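Solutions in Series of $P_n^m(d)$ and $P_n^m(d/k')$

8. Besides (2.3) there is also the integral equation

$$y(u) = \lambda\int_{-2K}^{+2K} P_n\!\left(\frac{ik}{k'}\,\mathrm{cn}\,u\,\mathrm{cn}\,v\right)y(v)\,dv \tag{8.1}$$

(Whittaker and Watson, 1927, § 23.6), which is satisfied by certain Lamé functions.
Now, putting in (4.8)

$$t = 0,\quad z = d,\quad (1-z^2)^{1/2} = ks,\quad \phi = \frac{\pi}{2} - \mathrm{am}\,v, \tag{8.2}$$

we have the expansion of the nucleus of (2.3)

$$P_n(ks\sin\mathrm{am}\,v) = \sum_{m=-\infty}^{+\infty}(-)^m\frac{\Gamma(n-m+1)}{\Gamma(n+m+1)}P_n^m(0)P_n^m(d)\cos m(\tfrac12\pi - \mathrm{am}\,v). \tag{8.3}$$

Similarly,

$$t = 0,\quad z = d/k',\quad (1-z^2)^{1/2} = ikc/k',\quad \phi = \mathrm{am}\,v$$

yields for the nucleus of (8.1) the expansion

$$P_n\!\left(\frac{ikc}{k'}\cos\mathrm{am}\,v\right) = \sum_{m=-\infty}^{+\infty}(-)^m\frac{\Gamma(n-m+1)}{\Gamma(n+m+1)}P_n^m(0)P_n^m(d/k')\cos(m\,\mathrm{am}\,v). \tag{8.4}$$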

m— -00 + 

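Inserting (8.3) and (8.4) into (2.3) and (8.1) respectively, we obtain formally the expansions

$$y(u) = \lambda\sum(-)^m\frac{\Gamma(n-m+1)}{\Gamma(n+m+1)}P_n^m(0)P_n^m(d)\int_{-2K}^{2K} y(v)\cos m(\tfrac12\pi - \mathrm{am}\,v)\,dv \tag{8.5}$$

and

$$y(u) = \lambda\sum(-)^m\frac{\Gamma(n-m+1)}{\Gamma(n+m+1)}P_n^m(0)P_n^m(d/k')\int_{-2K}^{2K} y(v)\cos(m\,\mathrm{am}\,v)\,dv, \tag{8.6}$$

representing certain Lamé functions.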



Expansions of Lame Functions into Series of Legendre Functions 255 

Again, there is a slightly different type of integral equation (Lambe and Ward, 1934, 
equation (4-16); Ince, 1940 a } § 10, III), 

y{u) = Aj ^ cn u cn v¥' n (k sn u sn v)y(v)dv. (8.7) 

In order to obtain the expansion of the nucleus of this integral equation, let us put in (4.9) 
the values (8.2), thus obtaining 

JP(n _ m -f“ 

ks cn vV’ n (ks sn v) = 2 ( , P”(o)P^)ot sin - am v). (8.8) 

772 =s= — CO ' ' 

Introducing this into (8.7), the expansion 

^ J"V^ _j_ j\ /»2E 

■K*) = - ) m p(^ J sin »2(^r - am »)</» (8.9) 

is easily obtained. 

It is not necessary, at this stage of the work, to enter into questions of convergence or to 
discuss which type of Lame functions are represented by the above series. 

9. The expansion (8.5) suggests trying a solution of Lamé's equation in the form of the series

$$y(u) = \sum_{m=0}^{\infty} A_m\Gamma(n-m+1)P_n^m(d). \tag{9.1}$$

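In order to introduce this series into Lamé's equation, let us first compute

$$\frac{dy}{du} = -k^2sc\sum_{m=0}^{\infty}A_m\Gamma(n-m+1)P_n^{m\prime}(d)$$

and

$$\frac{d^2y}{du^2} = \sum_{m=0}^{\infty}A_m\Gamma(n-m+1)\left\{k^4s^2c^2P_n^{m\prime\prime}(d) - k^2d(c^2-s^2)P_n^{m\prime}(d)\right\}$$

$$= \sum_{m=0}^{\infty}A_m\Gamma(n-m+1)\left\{k^2dP_n^{m\prime}(d) - \left(n(n+1)k^2c^2 - \frac{m^2c^2}{s^2}\right)P_n^m(d)\right\},$$

using (4.6). Hence

$$L[y] = \sum_{m=0}^{\infty}A_m\Gamma(n-m+1)\left\{k^2dP_n^{m\prime}(d) + \left(h - n(n+1)k^2 + \frac{m^2c^2}{s^2}\right)P_n^m(d)\right\}$$

and, employing (4.4),

$$L[y] = \sum_{m=0}^{\infty}A_m\Gamma(n-m+1)\big\{\tfrac14k^2P_n^{m+2}(d) + \tfrac12[2h - n(n+1)k^2 - (2-k^2)m^2]P_n^m(d)$$

$$\qquad + \tfrac14k^2(n-m+1)(n-m+2)(n+m-1)(n+m)P_n^{m-2}(d)\big\} = 0. \tag{9.2}$$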
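The passage from the form of $L[y]$ containing $P_n^{m\prime}(d)$ to the three-term form (9.2) rests on a relation connecting $P_n^{m+2}$, $P_n^m$ and $P_n^{m-2}$. The Python sketch below is a numerical spot-check only, under stated assumptions: integer $n \ge 2$, $0 \le m \le n$, real $-1 < x < 1$, and Ferrers functions with the Condon-Shortley phase. The relation tested,
$x\,P_n^{m\prime} + m^2P_n^m/(1-x^2) = \tfrac14P_n^{m+2} + \tfrac12[n(n+1)+m^2]P_n^m + \tfrac14(n-m+1)(n-m+2)(n+m-1)(n+m)P_n^{m-2}$,
is inferred from the coefficients quoted for (9.2) and is not copied from the source's formula (4.4); all function names are hypothetical.

```python
import math

def legendre_poly(n):
    """Coefficients of P_n(x), lowest power first, via Bonnet's recurrence."""
    prev, cur = [1.0], [0.0, 1.0]
    if n == 0:
        return prev
    for j in range(1, n):
        nxt = [0.0] + [(2 * j + 1) * c / (j + 1) for c in cur]
        for i, c in enumerate(prev):
            nxt[i] -= j * c / (j + 1)
        prev, cur = cur, nxt
    return cur

def deriv(coeffs, times):
    """Differentiate a polynomial (lowest power first) `times` times."""
    for _ in range(times):
        coeffs = [i * c for i, c in enumerate(coeffs)][1:]
    return coeffs

def horner(coeffs, x):
    total = 0.0
    for c in reversed(coeffs):
        total = total * x + c
    return total

def ferrers(n, m, x):
    """Ferrers function P_n^m(x) for integer n, |m| <= n or m > n (then zero)."""
    if m < 0:
        scale = math.factorial(n + m) / math.factorial(n - m)
        return (-1) ** (-m) * scale * ferrers(n, -m, x)
    return (-1) ** m * (1 - x * x) ** (m / 2) * horner(deriv(legendre_poly(n), m), x)

def ferrers_dx(n, m, x):
    """d/dx of P_n^m(x), 0 <= m <= n, from the product rule."""
    w = 1 - x * x
    dm = horner(deriv(legendre_poly(n), m), x)
    dm1 = horner(deriv(legendre_poly(n), m + 1), x)
    return (-1) ** m * (w ** (m / 2) * dm1 - m * x * w ** (m / 2 - 1) * dm)

def residual(n, m, x):
    """Left side minus right side of the relation stated in the lead-in."""
    lhs = x * ferrers_dx(n, m, x) + m * m * ferrers(n, m, x) / (1 - x * x)
    rhs = (0.25 * ferrers(n, m + 2, x)
           + 0.5 * (n * (n + 1) + m * m) * ferrers(n, m, x)
           + 0.25 * (n - m + 1) * (n - m + 2) * (n + m - 1) * (n + m)
           * ferrers(n, m - 2, x))
    return lhs - rhs

worst = max(abs(residual(n, m, x))
            for n in range(2, 7)
            for m in range(0, n + 1)
            for x in (-0.7, 0.3, 0.9))
print(worst)
```

Multiplying the relation by $k^2$ and adding $(h - n(n+1)k^2 - m^2)P_n^m$ reproduces the bracket $\tfrac12[2h - n(n+1)k^2 - (2-k^2)m^2]$ of (9.2), using $c^2/s^2 = -1 + k^2/(1-d^2)$.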

This is a relation of the form (5.3), and hence for the coefficients we obtain the recurrence
relations (5.6) with

$$X_m = A_m,\quad a = 4h/k^2 - 2n(n+1),\quad b = 2(2/k^2 - 1). \tag{9.3}$$

From the results of § 5 it follows that a solution of the form

$$\sum A_{2r}\Gamma(n-2r+1)P_n^{2r}(d)\quad\text{or}\quad\sum A_{2r+1}\Gamma(n-2r)P_n^{2r+1}(d)$$

exists if $h$ satisfies the transcendental equation (6.3) or (6.4) respectively. The coefficients
of these expansions are given by the recurrence formulae (5.7) and (5.8) respectively, always
with the above values of $a$ and $b$.

The continued fractions connected with these equations are convergent unless $b$ is real and
$-2 < b < 2$, i.e. unless $k^2$ is real and $> 1$. Hence our solution is valid in the whole plane of
the complex parameter $n$; in the plane of the complex parameter $k^2$ we have a branch-cut
from 1 along the positive real axis to infinity. According to (7.2), the series is convergent
in that region of the complex variable $u$ in which

$$\left|\frac{1-d}{1+d}\right| < \left|\frac{1 \pm k'}{k}\right|^2, \tag{9.4}$$

whichever sign gives the larger value.

In the applications the most important case is that in which $k = \sin\theta$, $k' = \cos\theta$, and the
modular angle is (real and) acute. In this case the domain of convergence is

$$\left|\frac{1-d}{1+d}\right| < \cot^2\tfrac12\theta. \tag{9.5}$$

The solution suggested by (8.6) will be taken in the form

$$y(u) = \sum_{m=0}^{\infty} A^*_m\Gamma(n-m+1)P_n^m(d/k'). \tag{9.6}$$

In this case we obtain

$$L[y] = -\sum_{m=0}^{\infty}A^*_m\Gamma(n-m+1)\big\{\tfrac14k^2P_n^{m+2}(d/k') - \tfrac12[2h - n(n+1)k^2 - (2-k^2)m^2]P_n^m(d/k')$$

$$\qquad + \tfrac14k^2(n-m+1)(n-m+2)(n+m-1)(n+m)P_n^{m-2}(d/k')\big\}. \tag{9.7}$$

Hence the coefficients satisfy the recurrence formulae (5.6) with

$$X_m = A^*_m,\quad a = 2n(n+1) - 4h/k^2,\quad b = 2(1 - 2/k^2). \tag{9.8}$$

The continued fractions are again convergent in the cut $k^2$-plane. The domain of convergence
in the $u$-plane is different, being determined by

$$\left|\frac{k'-d}{k'+d}\right| < \left|\frac{1 \pm k'}{k}\right|^2 \tag{9.9}$$

or

$$\left|\frac{k'-d}{k'+d}\right| < \cot^2\tfrac12\theta \tag{9.10}$$

respectively, according to whether $k$ is complex or real (and a proper fraction).

10. The solution of (2.1), 

.T =77 2 A 'nF( n ~m + i)»*P“(a?) (io.x) 

m —1 

is obviously of the form (5.2) with z = /c = £' -2 , and is suggested by (8.9). We have 

fu = %™ A ™ V{n ~ m + l} {“~ 

and 

d ~£i " SWIf (* - m + + ai*)P!+ £ - - P?(<f)j 

= Ys {n~m + i)^ 3 &dP%\d) - £«(» + i)k 2 c 2 - & 

using (4.6). Hence 

L I>]= J s 2- » + T)^dPf(d) + ^ + *» - n(n + 1)& + 2 -^ + «*£jpj(<*)J, 
or, employing (4.5), 

L[> ,] = 0 = _ 2 A;i \n-m + + 2>P£+V) + [A - \n(n + i)& - J( 2 - &)m*]m-p™(d) 

Bis 

+ ik\n -m + i)(n ~m -f 2 )(n+m -1 ){n +m)(m - 2)P^“ 3 (^)}. (10.2) 



This is of the form (5.9) with

$$X'_m = A'_m,\quad a' = 4h/k^2 - 2n(n+1),\quad b' = 2(2/k^2 - 1), \tag{10.3}$$

and hence we obtain recurrence formulae (5.10) for these values. Since $b'$ has the value of $b$
of (9.3), the series (10.1) and the continued fractions related to it are convergent in the same
domains of $n$, $k$, $u$ as the series (9.1) and the corresponding continued fractions.

Again there is another solution,

y = jkc ~ m + ^ wP » ! (|) (i°-4) 

with

$$X'_m = A^{*\prime}_m,\quad a' = 2n(n+1) - 4h/k^2,\quad b' = 2(1 - 2/k^2). \tag{10.5}$$

The convergence properties of this series are identical with those of (9.6).


Series of $P_n^m(s)$, $P_n^m(ks)$, $P_n^m(c)$ and $P_n^m(ikc/k')$

11. Besides the integral equations used hitherto there are other integral equations satisfied
by Lamé polynomials (Whittaker and Watson, 1927, § 23.6). They suggest trying instead
of (9.1) and (9.6) the solutions

$$y = \sum_{m=0}^{\infty} B_m\Gamma(n-m+1)P_n^m(s) \tag{11.1}$$

and

$$y = \sum_{m=0}^{\infty} B^*_m\Gamma(n-m+1)P_n^m(ks). \tag{11.2}$$

Proceeding in the same way as in § 9, we obtain the recurrence formulae (5.6) with

$$X_m = B_m,\quad a = 2n(n+1) - 4[n(n+1) - h]/k'^2,\quad b = 2(1 - 2/k'^2) \tag{11.3}$$

or

$$X_m = B^*_m,\quad a = 4[n(n+1) - h]/k'^2 - 2n(n+1),\quad b = 2(2/k'^2 - 1) \tag{11.4}$$

respectively.

The continued fractions originating from these recurrence formulae are convergent unless
$k'^2$ is real and $> 1$. Hence in this case the branch-cut in the $k^2$-plane goes from 0 along the
negative real axis to $-\infty$. Hence this type of expansion will be still valid for $k = 1$ where
the expansions of §§ 8-10 cease to have a meaning.

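The expansion (11.1) is convergent in that domain of the complex plane $u$ in which

$$\left|\frac{1-s}{1+s}\right| < \left|\frac{1 \pm k}{k'}\right|^2\quad\text{or}\quad\left|\frac{1-s}{1+s}\right| < \cot^2(\tfrac14\pi - \tfrac12\theta), \tag{11.5}$$

the former formula being valid for complex $k$ with that sign which yields the larger modulus
of $1 \pm k$, the latter for real and positive acute modular angles. The expansion (11.2) is
convergent in the domain

$$\left|\frac{1-ks}{1+ks}\right| < \left|\frac{1 \pm k}{k'}\right|^2\quad\text{or}\quad\left|\frac{1-ks}{1+ks}\right| < \cot^2(\tfrac14\pi - \tfrac12\theta) \tag{11.6}$$

of the complex $u$-plane.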

Similarly for the coefficients of the solutions

$$y = \frac{d}{c}\sum_{m=1}^{\infty} B'_m\Gamma(n-m+1)\,m\,P_n^m(s) \tag{11.7}$$

and

$$y = \frac{c}{d}\sum_{m=1}^{\infty} B^{*\prime}_m\Gamma(n-m+1)\,m\,P_n^m(ks) \tag{11.8}$$

the recurrence relations (5.10) hold with

$$X'_m = B'_m,\quad a' = a,\quad b' = b \tag{11.9}$$

from (11.3) and

$$X'_m = B^{*\prime}_m,\quad a' = a,\quad b' = b \tag{11.10}$$

from (11.4) respectively. The convergence properties of these series and of the corresponding
continued fractions are identical with the convergence properties of the series (11.1) and (11.2)
and of the corresponding infinite continued fractions respectively.

Plainly the recurrence formulae for the coefficients of series into $P_n^m(ks)$ originate from
the corresponding recurrence relations for the coefficients of series into $P_n^m(d)$ by changing
$k$ and $h$ into $k'$ and $n(n+1) - h$ respectively. The same change leads from the coefficients of
series into $P_n^m(d/k')$ to those of series into $P_n^m(s)$.

12. Finally we have to deal with the last quadruple of solutions which again may be
obtained by a transformation of Lamé's equation. Let us assume the solutions


$$y = \sum_{m=0}^{\infty} C_m\Gamma(n-m+1)P_n^m(c), \tag{12.1}$$

$$y = \sum_{m=0}^{\infty} C^*_m\Gamma(n-m+1)P_n^m(ikc/k'), \tag{12.2}$$

$$y = \frac{d}{s}\sum_{m=1}^{\infty} C'_m\Gamma(n-m+1)\,m\,P_n^m(c), \tag{12.3}$$

and

$$y = \frac{s}{d}\sum_{m=1}^{\infty} C^{*\prime}_m\Gamma(n-m+1)\,m\,P_n^m\!\left(\frac{ikc}{k'}\right). \tag{12.4}$$

Inserting these series into Lamé's differential equation, an analysis similar to that of
§§ 9 and 10 yields for the coefficients $C_m$, $C^*_m$ and $C'_m$, $C^{*\prime}_m$ the relations (5.6) and (5.10) with

$$X_m = C_m,\quad X'_m = C'_m,\quad a = a' = 4h - 2n(n+1),\quad b = b' = 4k^2 - 2 \tag{12.5}$$

and

$$X_m = C^*_m,\quad X'_m = C^{*\prime}_m,\quad a = a' = 2n(n+1) - 4h,\quad b = b' = 2 - 4k^2 \tag{12.6}$$

respectively.

The infinite continued fractions connected with the recurrence formulae for the $C_m$ and
$C^*_m$ are convergent unless $k^2$ is real and a proper fraction. Hence just in the case which is
most important in the applications, the series of this section present new difficulties. Thus
they are not likely to be of much use except in connection with the polynomial solutions of
Lamé's equation where the question of convergence does not arise.

For complex $k$ (or real $k > 1$ or $< -1$), however, the continued fractions are convergent
for any value of $n$. The series (12.1) and (12.3) are convergent in the domain

$$\left|\frac{1-c}{1+c}\right| < |k \pm ik'|^2, \tag{12.7}$$

whereas the series (12.2) and (12.4) are convergent in the domain

$$\left|\frac{k' - ikc}{k' + ikc}\right| < |k \pm ik'|^2 \tag{12.8}$$

of the complex $u$-plane. In both cases that sign has to be taken which assigns to $|k \pm ik'|$
the larger value.


Lamé Polynomials

13. Each of the twelve types of series gives rise to two distinct types of solutions, one
with even, the other with odd subscripts $m$ of the coefficients. So, for instance, from (9.1)
we have the two solutions

$$y = \sum_{r=0}^{\infty} A_{2r}\Gamma(n-2r+1)P_n^{2r}(d)\quad\text{and}\quad y = \sum_{r=0}^{\infty} A_{2r+1}\Gamma(n-2r)P_n^{2r+1}(d),$$

which will be distinguished as (9.1)e and (9.1)o.

First we shall deal with the simplest type of Lamé functions, which are known to be
polynomials of degree $n$ in $s$, $c$, $d$. These Lamé functions, generally called Lamé polynomials,
have some features of their own. $\Pi_n$ shall denote any polynomial of degree $\tfrac12n$ in $s^2$ or $c^2$
or $d^2$, and $n$ shall be a positive integer in this and the following section.

In the case of Lamé polynomials, the coefficients of our series vanish whenever $m > n$.
Hence the series terminate as well as the continued fractions associated with them and the
question of convergence does not arise at all. No restrictions on $k$ or $u$ are necessary, and any
of the twelve types of solutions may be used.

Corresponding to a given integer value of $n > 0$, there are in general exactly $2n+1$ Lamé
polynomials corresponding to as many different values of $h$. There are eight different types
of Lamé polynomials (Whittaker and Watson, 1927, § 23.5). For even $n$ we have Lamé
polynomials of the first species, of form $\Pi_n$, and of the third species, of forms $sc\,\Pi_{n-2}$,
$ds\,\Pi_{n-2}$ or $cd\,\Pi_{n-2}$. For odd values of $n$ we have Lamé polynomials of second species, of
forms $s\Pi_{n-1}$, $c\Pi_{n-1}$ or $d\Pi_{n-1}$, and of fourth species, $scd\,\Pi_{n-3}$.

It will be seen that each type of Lamé polynomials may be represented in six different
ways by a terminating series of associated Legendre polynomials.

Take for instance (9.1)e. It is of form $\Pi_n$ if $n$ is even, while it is of form $d\Pi_{n-1}$ if $n$ is
odd. Correspondingly (9.1)o is of form $sd\,\Pi_{n-2}$ if $n$ is even, and of form $s\Pi_{n-1}$ if $n$ is odd.
Dealing similarly with all twelve types, we obtain the following table of representations of
Lamé polynomials:


n      Form               Represented by
Even   $\Pi_n$            (9.1)e   (9.6)e   (11.1)e  (11.2)e  (12.1)e  (12.2)e
Odd    $s\Pi_{n-1}$       (9.1)o   (10.4)o  (11.1)e  (11.2)e  (12.1)o  (12.4)o
Odd    $c\Pi_{n-1}$       (9.6)o   (10.1)o  (11.1)o  (11.8)o  (12.1)e  (12.2)e
Odd    $d\Pi_{n-1}$       (9.1)e   (9.6)e   (11.2)o  (11.7)o  (12.2)o  (12.3)o
Even   $cd\,\Pi_{n-2}$    (9.6)o   (10.1)o  (11.7)e  (11.8)e  (12.2)o  (12.3)o
Even   $ds\,\Pi_{n-2}$    (9.1)o   (10.4)o  (11.2)o  (11.7)o  (12.3)e  (12.4)e
Even   $sc\,\Pi_{n-2}$    (10.4)e  (10.1)e  (11.1)o  (11.8)o  (12.1)o  (12.4)o
Odd    $scd\,\Pi_{n-3}$   (10.1)e  (10.4)e  (11.7)e  (11.8)e  (12.3)e  (12.4)e
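The table can be regenerated from a parity argument, as the following Python sketch illustrates. It is a bookkeeping aid, not the author's derivation, and it rests on assumptions stated here: for integer $n$, $P_n^m(z)$ is $(1-z^2)^{m/2}$ times a polynomial in $z$ of the parity of $n-m$; $(1-z^2)^{1/2}$ is proportional to $s$, $c$ or $d$ as listed; and the series (10.1), (10.4), (11.7), (11.8), (12.3), (12.4) carry the prefactors $c/s$, $s/c$, $d/c$, $c/d$, $d/s$, $s/d$ respectively, which are inferred rather than read from the source. The names `SERIES`, `form` and `table` are hypothetical.

```python
# (label, letter of the variable z, letter that sqrt(1 - z^2) is proportional to,
#  numerator of the prefactor or None). For prefixed series the prefactor is
#  numerator / root. The prefactors are assumptions of this sketch.
SERIES = [
    ("9.1", "d", "s", None), ("9.6", "d", "c", None),
    ("10.1", "d", "s", "c"), ("10.4", "d", "c", "s"),
    ("11.1", "s", "c", None), ("11.2", "s", "d", None),
    ("11.7", "s", "c", "d"), ("11.8", "s", "d", "c"),
    ("12.1", "c", "s", None), ("12.2", "c", "d", None),
    ("12.3", "c", "s", "d"), ("12.4", "c", "d", "s"),
]

def form(z, root, numer, m_even, n_even):
    """Set of factors among s, c, d that multiply the even polynomial Pi."""
    factors = set()
    root_power = 0 if m_even else 1      # parity of the power of sqrt(1 - z^2)
    if numer is not None:
        root_power -= 1                  # prefactor numer / root
        factors.add(numer)
    if root_power % 2:
        factors.add(root)
    if n_even != m_even:                 # remaining polynomial is odd in z
        factors.add(z)
    return frozenset(factors)

def table():
    rows = {}
    for label, z, root, numer in SERIES:
        for m_even in (True, False):
            for n_even in (True, False):
                key = (n_even, form(z, root, numer, m_even, n_even))
                rows.setdefault(key, []).append(label + ("e" if m_even else "o"))
    return rows

rows = table()
for (n_even, factors), labels in sorted(rows.items(), key=lambda kv: (not kv[0][0], len(kv[0][1]), sorted(kv[0][1]))):
    print("even" if n_even else "odd ", "".join(sorted(factors)) or "1", labels)
```

Each of the eight forms receives exactly six series, and the grouping agrees with the rows of the table.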


14. Each of the doubly periodic Lamé functions can be represented by six terminating
series of associated Legendre polynomials $P_n^m(z)$, $z$ being $s$, $c$, $d$, $ks$, $ikc/k'$, $d/k'$ respectively
in the six cases. There are certain interesting relations between two different series
representing the same function.

Take for instance (9.1)e and (9.6)e, i.e.

$$\sum_{r=0} A_{2r}\Gamma(n-2r+1)P_n^{2r}(d)\quad\text{and}\quad\sum_{r=0} A^*_{2r}\Gamma(n-2r+1)P_n^{2r}(d/k'),$$

both representing solutions of the form $\Pi_n$ if $n$ is even, and of the form $d\Pi_{n-1}$ if $n$ is odd.
The corresponding values of $a$ and $b$ differ only in sign, and hence the $\alpha_m$ are the same in
both cases while the $\beta_m$ differ only in sign if $m \ne 1$ ($\beta_1$ is essentially different in both
cases). Hence we obtain the same (terminating) continued fraction (6.3) in both cases and
consequently the same values $h$.

Taking the same value of $h$ in (9.1)e and (9.6)e, we obtain except for a constant factor
the same periodic solution, for to each value of $h$ there belongs only one periodic solution
(Whittaker and Watson, 1927, § 23.47). Hence

$$\sum A_{2r}\Gamma(n-2r+1)P_n^{2r}(d) = \lambda\sum A^*_{2r}\Gamma(n-2r+1)P_n^{2r}(d/k'), \tag{14.1}$$

where $\lambda$ is a constant.

Furthermore, it is easily seen from (5.7) that, the recurrence relations for $A_{2r}$ and $A^*_{2r}$
differing only in the sign of $\beta_{m-2}$, we may put $A^*_{2r} = (-)^rA_{2r}$. Hence instead of (14.1) we
obtain the remarkable relation

$$\sum A_{2r}\Gamma(n-2r+1)P_n^{2r}(d) = \Lambda\sum(-)^rA_{2r}\Gamma(n-2r+1)P_n^{2r}(d/k'), \tag{14.2}$$

in which $\Lambda$ is a kind of a characteristic number.

There is no corresponding relation between (9.1)o and (9.6)o. The extra term $+1$ in
$\beta_1$ makes the equations (6.4) different in both cases, and hence (9.1)o and (9.6)o do not belong
to the same value of $h$. In fact, from the table of § 13 it is seen that they represent functions
of different types.

It is of course not surprising at all that a terminating series of $P_n^{2r}(d)$ should be
representable as a terminating series of $P_n^{2r}(d/k')$, because all terminating series of $P_n^{2r}(d)$
have this property. The particular feature of Lamé polynomials is that the coefficients of the
two representations differ only in a factor $(-)^r\Lambda$. I believe that this property characterises
Lamé polynomials of first species ($n$ even) and of second species of form $d\Pi_{n-1}$ ($n$ odd) as
completely as does the integral equation (Whittaker and Watson, 1927, § 23.6)

$$y(u) = \lambda\int_{-2K}^{2K} P_n\!\left(\frac{1}{k'}\,\mathrm{dn}\,u\,\mathrm{dn}\,v\right)y(v)\,dv. \tag{14.3}$$

I do not see, however, any simple way of deducing this integral equation or Lamé's differential
equation directly from (14.2).

Similarly we obtain five more relations:

$$\sum B_{2r}\Gamma(n-2r+1)P_n^{2r}(s) = \Lambda\sum(-)^rB_{2r}\Gamma(n-2r+1)P_n^{2r}(ks) \tag{14.4}$$

satisfied by Lamé functions of first species ($n$ even) or second species with factor $s$ ($n$ odd),

$$\sum C_{2r}\Gamma(n-2r+1)P_n^{2r}(c) = \Lambda\sum(-)^rC_{2r}\Gamma(n-2r+1)P_n^{2r}(ikc/k') \tag{14.5}$$

for Lamé polynomials of first species ($n$ even) or second species with factor $c$ ($n$ odd),

$$\frac{c}{s}\sum A'_{2r}\Gamma(n-2r+1)\,r\,P_n^{2r}(d) = \Lambda\frac{s}{c}\sum(-)^rA'_{2r}\Gamma(n-2r+1)\,r\,P_n^{2r}(d/k') \tag{14.6}$$

for Lamé polynomials of third species with factor $sc$ ($n$ even) or of fourth species ($n$ odd),

$$\frac{d}{c}\sum B'_{2r}\Gamma(n-2r+1)\,r\,P_n^{2r}(s) = \Lambda\frac{c}{d}\sum(-)^rB'_{2r}\Gamma(n-2r+1)\,r\,P_n^{2r}(ks) \tag{14.7}$$

for Lamé polynomials of third species with factor $cd$ ($n$ even) or of fourth species ($n$ odd),
and finally

$$\frac{d}{s}\sum C'_{2r}\Gamma(n-2r+1)\,r\,P_n^{2r}(c) = \Lambda\frac{s}{d}\sum(-)^rC'_{2r}\Gamma(n-2r+1)\,r\,P_n^{2r}(ikc/k'), \tag{14.8}$$

which is satisfied by Lamé polynomials of third species with factor $sd$ or by Lamé polynomials
of fourth species according as $n$ is even or odd.

15. A few more remarks may be brought forward regarding Lamé functions of the second
kind, i.e. a second solution of Lamé's equation. It has been mentioned already that this
second solution cannot be periodic (Whittaker and Watson, 1927, § 23.47) when the function
of the first kind is a polynomial.

It has been observed in § 7 that in all our series $P_n^m(z)$ may be replaced by $P_n^m(-z)$.
This replacement, however, does not yield new solutions in the case of Lamé polynomials,
for $P_n^m(-z) = (-)^{n-m}P_n^m(z)$ if $m$ and $n$ are integers and $0 \le m \le n$.

The functions of second kind corresponding to Lamé polynomials are obtained by replacing
$P_n^m(z)$ in all our series by $Q_n^m(z)$. The associated Legendre functions of first and second
kind satisfying the same differential equation, recurrence relations, etc., the series remain
solutions of the differential equation and clearly they are the solutions of second kind.

Proceeding thus, for every function of the second kind six different representations are
obtained, and it must be proved that the functions represented by the six different series are
constant multiples of each other, i.e. that (14.2) and (14.4)-(14.8) are still valid if $P_n^{2r}(z)$ is
replaced everywhere by $Q_n^{2r}(z)$. The proof is very easy indeed.

Suppose for instance that the series obtained by replacing $P_n^m$ by $Q_n^m$ in (9.1)e and
(9.6)e are not constant multiples of each other. Then they are two linearly independent
solutions of (2.1), and hence (9.1)e, for instance, a solution of (2.1), must be a linear combination
of the two series mentioned. Now, $Q_n^m(z)$, and hence any terminating series of these functions,
behaves like const. $z^{-n-1}$ when $z$ tends to infinity. Thus a linear combination of two solutions
in Q-series vanishes when $u = iK'$, and could not be possibly a polynomial in $d$. All six series
of Q-functions represent, apart from a constant factor, the same solution: that one which is
known as Lamé function of the second kind (Whittaker and Watson, 1927, § 23.47).

The table of § 13 is also a table of the functions of second kind, giving, e.g., in its
first line the series in which ($n$ being even) $P_n^m(z)$ has to be replaced by $Q_n^m(z)$ in order to
obtain the Lamé functions of second kind and first species.

Transcendental Lamé Functions

16. Besides the $2n+1$ Lamé polynomials there is an infinity of transcendental Lamé
functions for any non-negative integer $n$ (Ince, 1940 a, p. 48). For non-integral values of
$n$ we have only transcendental Lamé functions (ibid., p. 49). We proceed now to the
discussion of the expansions of transcendental Lamé functions into series of associated
Legendre functions.

Most of what follows is still valid for complex $n$. For $n = -\tfrac12 + ip$ ($p$ real), for instance,
we obtain Hobson's functions of the elliptic cone (Hobson, 1892) developed into a series of
Mehler's conal harmonics. For the sake of brevity, however, and in order to fix the ideas,
we shall assume throughout the rest of the paper that $n$ is real and that $k = \sin\theta$ is real and a
positive proper fraction. Without further loss of generality we may take n to be 

It is important to bring out the fundamental difference between polynomial and
transcendental Lamé functions.

(i) Lamé polynomials are divided into four species and eight classes (§ 13). No such
distinction is possible with transcendental Lamé functions (Ince, 1940 c). This
was the reason for designing a new notation for Lamé functions of real period
by Ince (ibid.).

(ii) Lamé polynomials are doubly periodic functions. Transcendental Lamé functions
have only one primitive period, this being real, imaginary or complex, as the case
may be.

17. (i) Ince (1940 c) found every periodic Lamé function with the factor $\mathrm{dn}\,u$ to be identical
with a form without that factor, and pointed out that therefore, from the analytical point of
view, there are only four distinct types of Lamé functions.

Ince's argument is as follows. The solution

$$y = \sum A_{2r}s^{2r} \tag{17.1}$$

should be classified as belonging to the first species. He proves this solution to be convergent
for $|ks| < 1$, hence certainly in some domain in the vicinity of the real axis of $u$. On the
other hand,

$$y = d\sum C_{2r}s^{2r}, \tag{17.2}$$

convergent in the same domain, should be classified as belonging to the second species. Now

$$d = (1 - k^2s^2)^{1/2} = \sum\alpha_rs^{2r}$$

is convergent in the same domain, and hence (17.2) changes to

$$y = \sum\alpha_rs^{2r}\sum C_{2r}s^{2r} = \sum A'_{2r}s^{2r},$$

and this is clearly of the first species, in contradiction to our previous assertion.
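The absorption of the factor $d$ into a power series in $s^2$ is a Cauchy product, which the following Python sketch illustrates numerically. It is an illustration under assumptions: the coefficients `C` are hypothetical sample values (not Lamé coefficients), the symbols $\alpha_r$ and $A'_{2r}$ follow the notation used just above, and `dn_series` and `cauchy_product` are names introduced here.

```python
import math

def dn_series(k, terms):
    """Coefficients alpha_r of (1 - k^2 x)^(1/2) = sum alpha_r x^r, x = s^2."""
    alpha = [1.0]
    for r in range(1, terms):
        # binomial(1/2, r) * (-k^2)^r, built from the previous coefficient
        alpha.append(alpha[-1] * (0.5 - (r - 1)) / r * (-k * k))
    return alpha

def cauchy_product(a, b):
    """First min(len(a), len(b)) coefficients of the product of two power series."""
    size = min(len(a), len(b))
    return [sum(a[j] * b[r - j] for j in range(r + 1)) for r in range(size)]

k, s = 0.6, 0.4                            # sample modulus and value of sn u (assumptions)
terms = 30
alpha = dn_series(k, terms)
C = [0.5 ** r for r in range(terms)]       # hypothetical coefficients C_{2r}
A_prime = cauchy_product(alpha, C)         # coefficients A'_{2r}

x = s * s
lhs = math.sqrt(1 - k * k * x) * sum(c * x ** r for r, c in enumerate(C))
rhs = sum(a * x ** r for r, a in enumerate(A_prime))
print(lhs, rhs)
```

Inside $|ks| < 1$ the two printed values agree to rounding, so a series of the form (17.2) is also a series of the form (17.1).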



A similar remark applies, we may add, to the factor $\mathrm{sn}\,u$ when $u = K + it$ and $t$ is real.
In this case the most suitable representation of functions of the first species will be

$$y = \sum B_{2r}d^{2r} \tag{17.3}$$

instead of (17.1). This series is convergent for $|d| < 1$, and hence certainly in some domain
enclosing the line $u = K + it$. Also the series

$$ks = (1 - d^2)^{1/2} = k\sum\beta_rd^{2r}$$

is convergent in the same domain. Hence the solution

$$y = s\sum D_{2r}d^{2r},$$

apparently belonging to the second species, is equal to

$$\sum\beta_rd^{2r}\sum D_{2r}d^{2r} = \sum B'_{2r}d^{2r},$$

and hence it belongs to the first species as well, for the last power-series is clearly of the
form (17.3).

18. (ii) We shall examine the periodic properties of the power-series representations of
Lamé functions as well as that of their Fourier-Jacobi expansions and of their expansions
into series of associated Legendre functions.

Let us begin, for instance, with the power-series and take as an example (17.1), convergent
in the domain $|ks| < 1$. Clearly (17.1) has the real period $2K$, for it is possible to proceed
from any point $u$ inside the domain of convergence to the point $u + 2K$, remaining throughout
in the domain of convergence. Hence (17.1) as a uniformly convergent series of functions
of period $2K$ has this period.

It is entirely different with the imaginary period. A superficial inspection might suggest
that (17.1) has the imaginary period $2iK'$, for all its terms have this period. This is, however,
not conclusive. We know (Whittaker and Watson, 1927, § 22.34) that

$$\mathrm{sn}(\sigma + ir'K') = (k\,\mathrm{sn}\,\sigma)^{-1}\quad(r' = \pm1, \pm3, \pm5, \ldots),$$

and if $\sigma$ is real, $|\mathrm{sn}\,\sigma| \le 1$. Hence

$$|k\,\mathrm{sn}(\sigma + ir'K')| = \frac{1}{|\mathrm{sn}\,\sigma|} \ge 1\quad(r' = \pm1, \pm3, \pm5, \ldots).$$

Thus the lines $u = \sigma + ir'K'$ ($r' = \pm1, \pm3, \ldots$) lie outside the domain of convergence,
touching its circumference in the points $u = rK + ir'K'$ ($r, r' = \pm1, \pm3, \ldots$). These lines are
insurmountable barriers. There is no way of joining any point $u$ to the point $u + 2iK'$
or $u + 4iK'$ by a path entirely inside the domain of convergence.

Next let us examine the Fourier-Jacobi series of the same solution (17.1), viz.

$$y = \sum A_{2r}\cos(2r\,\mathrm{am}\,u). \tag{18.1}$$

It is hardly necessary to say that in (9.1), (17.1) and (18.1) the same symbol $A_{2r}$ has different
meanings. (18.1) is convergent in the domain (Ince, 1940)

$$\exp\{2|\Im(\mathrm{am}\,u)|\} < \frac{1+k'}{1-k'}\quad\text{or}\quad k\,|\sinh\Im(\mathrm{am}\,u)| < k',$$

where $\Im(z)$ is the imaginary part of $z$. The real period follows as in the former case.

Now, on the lines $u = \sigma + ir'K'$ ($r' = \pm1, \pm3, \ldots$) we have $\mathrm{sn}\,u = \sin\mathrm{am}\,u = (k\,\mathrm{sn}\,\sigma)^{-1}$, and
hence $\sin\mathrm{am}\,u$ is real and its modulus is larger than one. Consequently $\mathrm{am}\,u = \tfrac12\pi + i\Im(\mathrm{am}\,u)$
along these lines, and

$$k\,|\sinh\Im(\mathrm{am}\,u)| = k\,|\cos\mathrm{am}\,u| = \frac{\mathrm{dn}\,\sigma}{|\mathrm{sn}\,\sigma|},$$

and this is certainly $\ge k'$. Hence in this case too the lines $u = \sigma + ir'K'$ are tangents to
the domain of convergence and again barriers preventing any connection between $u$ and
$u + 2iK'$ inside the domain of convergence.

Finally let us turn to the expansion of (17.1) into a series of Legendre functions. A short
consideration shows that we have to take (9.1)e or (9.6)e.



Now, (9.1)e is convergent in the domain (9.4), i.e.

$$\left|\frac{1-d}{1+d}\right| < \left(\frac{1+k'}{k}\right)^2. \tag{18.2}$$

The real axis of $u$ corresponding to the interval $(k', 1)$ of the $d$-plane, and this interval lying
in the domain of convergence, the periodicity of this Lamé function modulo $2K$ can be shown
as before, $\mathrm{dn}\,u$ having this period.

If there was an imaginary period, it should be $4iK'$ or a multiple of it, $4iK'$ being the
primitive imaginary period of $\mathrm{dn}\,u$. A short consideration shows that the lines $u = \sigma + ir'K'$
($r' = \pm1, \pm3, \ldots$) are now inside the domain of convergence, but now the lines $u = \sigma + 2ir'K'$
($r' = \pm1, \pm3, \ldots$) are insurmountable obstacles. We have (Whittaker and Watson, § 22.34)

$$\mathrm{dn}(\sigma + 2ir'K') = -\mathrm{dn}\,\sigma\quad(r' = \pm1, \pm3, \pm5, \ldots),$$

and hence the above-mentioned lines correspond to the interval $(-1, -k')$ of the $d$-plane.
This interval is not in the domain (18.2), and hence again we cannot prove the periodicity
of our solutions modulo $4iK'$.

Incidentally it is seen that our series are convergent in parts of the complex $u$-plane where
neither power-series nor the Fourier-Jacobi series converge.
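The statements of § 18 about which lines of the $u$-plane fall inside the domain (18.2) can be spot-checked in the $d$-plane. The Python sketch below is an illustration only; the name `in_domain` and the sample modulus are assumptions, and the domain is taken as $|(1-d)/(1+d)| < ((1+k')/k)^2$. It uses the facts that $d = \mathrm{dn}\,u$ runs over $(k', 1)$ on the real axis, over $(-1, -k')$ on the lines $u = \sigma + 2ir'K'$ with $r'$ odd, and is purely imaginary on $u = \sigma + iK'$.

```python
import math

def in_domain(d, k):
    """True when |(1-d)/(1+d)| < ((1+k')/k)^2; d may be real or complex."""
    kp = math.sqrt(1 - k * k)
    bound = ((1 + kp) / k) ** 2
    if d == -1:
        return False
    return abs((1 - d) / (1 + d)) < bound

k = 0.6                                    # sample modulus (assumption)
kp = math.sqrt(1 - k * k)

real_axis = [kp + (1 - kp) * j / 10 for j in range(11)]      # d on the real axis of u
barrier = [-1 + (1 - kp) * j / 10 for j in range(1, 10)]     # interior of (-1, -k')
mid_line = [complex(0.0, -t) for t in (0.1, 1.0, 10.0)]      # d on u = sigma + iK'

print(all(in_domain(d, k) for d in real_axis))
print(any(in_domain(d, k) for d in barrier))
print(all(in_domain(d, k) for d in mid_line))
```

The three printed values should be True, False and True: the real axis and the line through $iK'$ lie inside the domain, while the interval $(-1, -k')$ lies outside it.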

19. The preceding reasoning does not prove the transcendental Lame functions to be only 
simply periodic, though it does show where the difficulties appear when one tries to establish 
the imaginary period of (17.1). Lame functions of other types present analogous difficulties. 
Either a real or an imaginary period is readily derived from any of the representations; it 
seems very likely that it is the only primitive period of the solution represented by the respective 
series. 

In fact this is the case. For a doubly periodic function of $u$ having the primitive periods
$4K$ and $4iK'$ should be a single-valued function of $s$, $c$, $d$. Now the point $u = iK'$, where
$s$, $c$ and $d$ become infinite, is a branch-point of every solution of Lamé's equation unless $n$ is
an integer. Hence the transcendental Lame functions belonging to non-integer values of n 
cannot be doubly periodic. A closer consideration shows that transcendental Lame functions 
belonging to integer values of n are not doubly periodic either. 

Hence the only doubly periodic solutions of Lame’s equation are the Lame polynomials. 


Lamé Functions of Real Period

20. Now it will be proved that series of Legendre functions of variable $\pm d$ or $\pm d/k'$
represent Lamé functions of real period. At first we shall assume that $n$ is not an integer.
According to (7.2), the series (9.1) and (10.1) are convergent in the domain

$$\left|\frac{1-d}{1+d}\right| < \frac{1+k'}{1-k'}, \tag{20.1}$$

i.e. outside a circle of the $d$-plane, the centre of which is $d = -\tfrac12(k' + 1/k')$, the radius being
$\tfrac12(1/k' - k') = k^2/(2k')$. In the period parallelogram of the $u$-plane with vertices $u = 0$, $2K$,
$2K + 4iK'$, $4iK'$, the domain of convergence is the part of the parallelogram outside a certain
region round the line connecting $2iK'$ and $2K + 2iK'$. In particular a strip, including the
real axis, is entirely inside the domain of convergence, and hence the real period $2K$ or $4K$
according as $m$ is even or odd is readily established.

In the $d$-plane there is a branch-cut along the negative real axis from $-1$ to $-\infty$. The
corresponding branch-cuts in the $u$-plane are the segment joining $u = iK'$ to $u = 3iK'$, and all
congruent segments.

Replacing $P_n^m(d)$ by $P_n^m(-d)$ in (9.1) and (10.1) (the originating series will be denoted
by (9.1)- and (10.1)- respectively) new Lamé functions are obtained. The domain of
convergence in the $d$-plane will be obtained by reflection of the domain (20.1) on $d = 0$;
correspondingly in the $u$-plane by shifting about $\pm 2iK'$. Hence these solutions are again Lamé
functions of real period $2K$ or $4K$. They differ from (9.1) and (10.1) by being valid not on
and near the real axis, but in a strip enclosing the line $u = \sigma + 2iK'$ ($\sigma$ real) and all congruent
lines.

The series (9.1)Q and (10.1)Q, i.e. the series originating when in (9.1) and (10.1)
respectively $P_n^m$ is replaced by $Q_n^m$, represent Lamé functions of the second kind. They are
convergent in the domain where both (9.1) and (9.1)- are convergent. Hence in the $u$-plane
they are neither convergent on the real axis nor on the line $u = 2iK' + \sigma$. Their domain of
convergence encloses the lines $u = iK' + \sigma$ and $u = 3iK' + \sigma$ ($\sigma$ real). Nevertheless in general
(i.e. except for integer $n$) they have no period at all because it is not possible to pass from one
period parallelogram into another one inside the domain of convergence without crossing
branch-cuts.

21. Similarly the domain of convergence in the $d$-plane of the series (9.6) and (10.4) is
characterised by

$$\left|\frac{k'-d}{k'+d}\right| < \frac{1+k'}{1-k'} = \cot^2\tfrac12\theta. \tag{21.1}$$

It is that part of the $d$-plane which is outside the circle with centre $-\tfrac12(2 - k^2)$ and radius $\tfrac12k^2$.
The corresponding domain in the period parallelogram excludes the line $u = 2iK' + \sigma$ and a
certain neighbourhood of it, and consists of all the rest of the period parallelogram. There
are again branch-cuts joining $iK'$ to $3iK'$ and joining $2iK'$ to $2K + 2iK'$, and also all congruent
lines.

(14.2) and (14.6) are valid for transcendental Lamé functions too.

The domains of convergence of the corresponding series with $P_n^m(-d/k')$ and $Q_n^m(d/k')$
are obtained by the same process of reflecting (in the $d$-plane) and shifting (in the $u$-plane)
respectively as before.

22. In case of an integer $n > 0$, we have $m > n$ for the transcendental Lamé functions.
In this case the respective domains of convergence do not change at all but all branch-cuts
disappear. Therefore in this case we may pass from $u$ to $u + 2K$ or $u + 4K$, as the case may
be, entirely inside the domain of convergence of any of our series. Hence every solution of
Lamé's equation is a Lamé function in this case, this result being in agreement with the
investigations of Ince (1940 b, § 5).


Lamé Functions of Imaginary Period

23. Series of Legendre functions of variable $s$ or $ks$ represent Lamé functions of imaginary
periods $2iK'$ and $4iK'$. The discussion of these series being very similar to those with variable
$d$ and $d/k'$, a brief summary of the results will be given only. In fact, the two types of series
transform into each other by Jacobi's imaginary transformation

$$\mathrm{dn}(u, k) = k'\,\mathrm{sn}(K' - iK + iu, k').$$

(11.1) and (11.7) are convergent outside of the circle with centre $-\tfrac12(k + 1/k)$ and radius
$k'^2/(2k)$ of the $s$-plane. The period parallelogram in the $u$-plane has the vertices $0$, $4K$,
$4K + 2iK'$, $2iK'$; the series are convergent in this parallelogram except a certain domain round
the line joining $3K$ to $3K + 2iK'$. The branch-cuts are: in the $s$-plane the negative real axis
from $-1$ to $-\infty$, in the $u$-plane the two lines joining $3K$ to $3K + 2iK'$ and $2K + iK'$ to $4K + iK'$
respectively, and also all congruent lines. For instance, in the strip $0 < \Re(u) < 2K$ an
imaginary period is easily established, being $2iK'$ if $m$ is even and $4iK'$ if $m$ be odd.

The domain of convergence of (11.1)- and (11.7)- is obtained by reflection on $s = 0$ in
the $s$-plane and by shifting parallel to the real axis about $2K$ in the $u$-plane. These series
too have an imaginary period, established for instance in the strip $2K < \Re(u) < 4K$ and in
congruent strips.

The series (11.1)Q and (11.7)Q represent Lamé functions of second kind and are
convergent in the domain where both (11.1) and (11.1)- converge. Except for integer $n$ and
$m > n$, they have no period at all.

24. The series (11.2) and (11.8) are convergent outside of the circle with centre $-\tfrac12(1 + 1/k^2)$
and radius $k'^2/(2k^2)$ of the $s$-plane. The period parallelogram is the same as in § 23; the
series are convergent in it, except in a certain domain round the line joining $3K$ to $3K + 2iK'$.
The branch-cuts in the $u$-plane are the line joining $2K + iK'$ to $4K + iK'$, and all congruent lines.

The remarks of the preceding section regarding periods, series with $P_n^m(-ks)$ and $Q_n^m(ks)$
apply to this case too.





Limiting Cases 


25. There are a few remarkable limiting cases in which Lame’s equation reduces to some 
simpler equation. 

Let $k$ approach zero; (2.1) becomes 

$$\frac{d^2y}{du^2} + hy = 0,$$

having the solutions $\sin h^{1/2}u$ and $\cos h^{1/2}u$. It is interesting to see how the series of Legendre 
functions represent these solutions. 

Take for instance (9.1). For small values of $k$ we have 

$$d = (1 - k^2s^2)^{1/2} \sim 1 - \tfrac12 k^2s^2, \tag{25.1}$$

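and hence from the expression of $P_n^m$ by a hypergeometric series in § 4, 

$$P_n^m(d) \sim P_n^m(1 - \tfrac12 k^2s^2) \sim (-\tfrac12 ks)^m \frac{\Gamma(n+m+1)}{\Gamma(n-m+1)\,\Gamma(m+1)}. \tag{25.2}$$

Hence if $k$ is very small, (9.1) is approximately 

$$\sum_{m=0}^{\infty} (-\tfrac12 k)^m A_m \frac{\Gamma(n+m+1)}{\Gamma(m+1)}\, s^m = \sum_{m=0}^{\infty} \frac{\mathfrak{A}_m s^m}{\Gamma(m+1)}. \tag{25.3}$$

Since in this case $\lim k^m A_m$ $(k \to 0)$ is finite, we have in the limit $k \to 0$ for 

$$\mathfrak{A}_m = (-\tfrac12 k)^m A_m \Gamma(n+m+1)$$

the recurrence formula 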

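$$\mathfrak{A}_{m+2}/\mathfrak{A}_m = m^2 - h = 4(\tfrac12 m - \tfrac12\sqrt{h})(\tfrac12 m + \tfrac12\sqrt{h}).$$

Hence 

$$\frac{\mathfrak{A}_{2m}}{\Gamma(2m+1)} = \frac{(-\tfrac12\sqrt{h})_m (\tfrac12\sqrt{h})_m}{(\tfrac12)_m\, m!}\,\mathfrak{A}_0, \qquad \frac{\mathfrak{A}_{2m+1}}{\Gamma(2m+2)} = \frac{(\tfrac12 - \tfrac12\sqrt{h})_m (\tfrac12 + \tfrac12\sqrt{h})_m}{(\tfrac32)_m\, m!}\,\mathfrak{A}_1,$$

and we have the two solutions 

$$y_1 = \sum_{m=0}^{\infty} \frac{\mathfrak{A}_{2m}\, s^{2m}}{\Gamma(2m+1)} = \mathfrak{A}_0\, F(-\tfrac12\sqrt{h}, \tfrac12\sqrt{h}; \tfrac12; s^2)$$

and 

$$y_2 = \sum_{m=0}^{\infty} \frac{\mathfrak{A}_{2m+1}\, s^{2m+1}}{\Gamma(2m+2)} = \mathfrak{A}_1\, s\, F(\tfrac12 - \tfrac12\sqrt{h}, \tfrac12 + \tfrac12\sqrt{h}; \tfrac32; s^2).$$

Now, since $k = 0$, we have $s = \sin u$ and (Gauss, 1866, p. 127, XX and XVI) 

$$y_1 = \mathfrak{A}_0 \cos h^{1/2}u \quad\text{and}\quad y_2 = \mathfrak{A}_1 h^{-1/2} \sin h^{1/2}u. \tag{25.5}$$

The characteristic numbers belonging to periodic solutions are obviously $h = m^2$. 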
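The reduction to circular functions can be checked numerically. The following is a minimal sketch, not part of the original paper: it assumes the coefficients of the power series in $s = \sin u$ obey $\mathfrak{A}_{m+2}/\mathfrak{A}_m = m^2 - h$ with $\mathfrak{A}_0 = 1$ or $\mathfrak{A}_1 = 1$, and the function name `series_solution` and the sample values of `h` and `u` are illustrative only.

```python
import math

def series_solution(h, u, odd=False, terms=80):
    """Sum c_m * sin(u)**m over even (or odd) m, where c_m = A_m / m!.

    Uses A_{m+2} / A_m = m**2 - h, i.e.
    c_{m+2} = (m**2 - h) * c_m / ((m + 1) * (m + 2)),
    starting from c_0 = 1 (even series) or c_1 = 1 (odd series).
    """
    s = math.sin(u)
    m = 1 if odd else 0
    coeff = 1.0
    total = 0.0
    for _ in range(terms):
        total += coeff * s ** m
        coeff *= (m * m - h) / ((m + 1) * (m + 2))
        m += 2
    return total

h, u = 2.5, 0.7          # illustrative values with |u| < pi/2
root = math.sqrt(h)
print(series_solution(h, u), math.cos(root * u))
print(series_solution(h, u, odd=True), math.sin(root * u) / root)
```

When $h = m^2$ one of the two series terminates, which is the statement about the characteristic numbers; for $h = 9$ the odd series stops after the $s^3$ term.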

26. Another important limiting case is $k \to 1$. In this case we have to use the series of 
§ 11 and shall choose (11.1). The corresponding limiting processes with the other series 
of § 11 are similar. 

For $k \to 1$ we have 

$$s \to \tanh u \quad\text{and}\quad P_n^m(s) \to P_n^m(\tanh u).$$

Hence (11.1) changes to 

$$\sum B_m \Gamma(n - m + 1) P_n^m(\tanh u),$$

and the recurrence formulae for the $B_m$ degenerate into 

$$\{m^2 - n(n+1) + h\} B_m = 0 \qquad (m = 0, 1, 2, \ldots).$$

Hence all the $B_m$, with the possible exception of a single one, $B_M$ say, must vanish, and we 
must have $h = n(n+1) - M^2$, the corresponding periodic solution (of imaginary period only) 
being $P_n^M(\tanh u)$. This result is, except in the notation, in agreement with Dr Ince's 
investigations (Ince, 1940 a, § 9). 
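As a quick numerical illustration of this limiting case, the sketch below checks one instance, $n = 2$, $M = 1$, so that $h = n(n+1) - M^2 = 5$. It assumes that for $k = 1$ Lamé's equation takes the form $y'' + \{h - n(n+1)\tanh^2 u\}y = 0$, and it uses the fact that $P_2^1(x)$ is proportional to $x(1 - x^2)^{1/2}$; the function names and the finite-difference step are illustrative choices.

```python
import math

def lame_k1_residual(y, u, n, h, step=1e-4):
    """Residual of y'' + (h - n(n+1) tanh^2 u) y at u, with y'' by central differences."""
    second = (y(u + step) - 2.0 * y(u) + y(u - step)) / step ** 2
    return second + (h - n * (n + 1) * math.tanh(u) ** 2) * y(u)

def p21_of_tanh(u):
    """A constant multiple of P_2^1(tanh u): x * sqrt(1 - x^2) with x = tanh u."""
    return math.tanh(u) / math.cosh(u)

n, M = 2, 1
h = n * (n + 1) - M ** 2
residuals = [abs(lame_k1_residual(p21_of_tanh, u, n, h)) for u in (-1.3, 0.2, 0.9)]
print(h, max(residuals))
```

The residual is zero up to the discretisation error of the difference quotient, so it shrinks as `step` is reduced (until rounding dominates).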

26. Another limiting case of interest is when $k$ tends to zero and at the same time $n$ to 
infinity, so that $\lim n(n+1)k^2$ remains finite. We shall put $k = 2(-\theta)^{1/2}/n$ and make $n$ tend 
to infinity, $\theta$ being kept fixed. We have 

$$d = (1 - k^2s^2)^{1/2} \sim 1 - \tfrac12 k^2s^2 \sim 1 + \frac{2\theta}{n^2}\sin^2 u \qquad (n \to \infty),$$

and hence (Whittaker and Watson, 1927, § 17.4) 

$$P_n^m(d) \sim P_n^m\Bigl(1 + \frac{2\theta}{n^2}\sin^2 u\Bigr) \sim n^m J_m(2i\sqrt{\theta}\,\sin u).$$

Thus Lamé's equation tends to Mathieu's equation 

$$\frac{d^2y}{du^2} + (h + 4\theta\sin^2 u)\,y = 0,$$

and the series (9.1) for instance to a series of the form 

$$\sum A_m J_m(2i\theta^{1/2}\sin u),$$

which is Heine's expansion of the functions of the elliptic cylinder (Heine, 1878, p. 413) 
mentioned in § 2. 

There is another similar limiting process: $n$ tends to infinity so that $h - n(n+1)$ and 
$n(n+1)(1 - k^2)$ tend to finite limits. This limiting process again yields series of Bessel functions, 
this time from the series of § 10, and so need not be dealt with in extenso. 
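The passage from Legendre to Bessel functions used for the Mathieu limit can be illustrated for the simplest case $m = 0$, where $J_0(iz) = I_0(z)$. The sketch below is an illustration under that restriction only; the helper names `legendre_p`, `bessel_i0` and `limit_gap`, and the sample values of $\theta$ and $u$, are assumptions made here and not taken from the paper.

```python
import math

def legendre_p(n, x):
    """P_n(x) from the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    previous, current = 1.0, x
    if n == 0:
        return previous
    for k in range(1, n):
        previous, current = current, ((2 * k + 1) * x * current - k * previous) / (k + 1)
    return current

def bessel_i0(z, terms=40):
    """Modified Bessel function I_0(z) = J_0(iz), summed from its power series."""
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= (z / 2.0) ** 2 / ((j + 1) ** 2)
    return total

def limit_gap(n, theta, u):
    """|P_n(1 + 2 theta sin^2(u) / n^2) - I_0(2 sqrt(theta) sin u)|."""
    x = 1.0 + 2.0 * theta * math.sin(u) ** 2 / n ** 2
    return abs(legendre_p(n, x) - bessel_i0(2.0 * math.sqrt(theta) * math.sin(u)))

for n in (20, 200, 2000):
    print(n, limit_gap(n, 1.5, 0.8))
```

The gap decreases roughly like $1/n$, in line with the asymptotic relation being a statement about $n \to \infty$ with $\theta$ fixed.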


Other Expansions 

27. Besides the expansions dealt with already, there are some other expansions of Lame 
functions by series of Legendre functions worthy of mention. 

Applying Whipple’s transformation [Hobson, 1931, p. 245, equation (92), and p. 247, 
equation (93)], 

q:w- - p im> ° 1 <17 ' 2) 

to the associated Legendre functions occurring in our series, new expansions are readily 
obtained. These new expansions, representing the same functions and having the same 
domains of convergence, are entirely equivalent to, in fact only another way of writing of, 
the series already studied and need not be considered more closely. 

Ince's Fourier-Jacobi expansions of Lamé functions lead to another type of series. We 
have, for instance, 

$$\cos m\omega = (\tfrac12\pi\sin\omega)^{1/2}\, P_{m-\frac12}^{\frac12}(\cos\omega), \tag{27.3}$$

and therefore Ince's series (2.2) may be written alternatively 

$$y = d\,(\tfrac12\pi s)^{1/2} \sum_{r=0}^{\infty} C_{2r}\, P_{2r-\frac12}^{\frac12}(c), \tag{27.4}$$

where the $C_{2r}$ denote Ince's coefficients, not those denoted by this letter in § 12. 

It is hardly necessary to mention that this latter series, when convergent, is better suited 
for numerical calculations than the series of §§ 9-12. But it seems that there is no similar 
representation of the corresponding Lame function of second kind. 





Summary 

28. This paper investigates certain properties of periodic solutions of 
Lamé's differential equation by representing these solutions by (in general 
infinite) series of associated Legendre functions. Terminating series of associated Legendre 
functions representing Lamé polynomials have been used by E. Heine and G. H. Darwin. 
The latter used them also for numerical computation of Lamé polynomials. It appears that 
infinite series of Legendre functions representing transcendental Lamé functions have not been 
discussed previously. In two respects these series seem to be superior to the generally used 
power-series and Fourier-Jacobi series. (i) They are convergent in some parts of the complex 
plane of the variable where both power-series and Fourier-Jacobi series diverge. (ii) They 
permit, by simply replacing Legendre functions of the first kind by those of the second kind, dealing 
with Lamé functions of the second kind as well as Lamé functions of the first kind (§ 15). 

In §§ 2 and 8 of the present paper the series are heuristically deduced from the integral 
equations satisfied by periodic Lamé functions. Inserting the series found heuristically, 
with unknown coefficients, into Lamé's differential equation, recurrence relations for the 
coefficients are obtained (§§ 9-12). These recurrence relations yield the (in general 
transcendental) equations, in the form of (in general infinite) continued fractions, for the determination 
of the characteristic numbers. The convergence of the series can be discussed completely. 

There are altogether forty-eight different series. Every one of the eight types of Lamé 
polynomials is represented by six different series (see table in § 13). There are interesting 
relations (§ 14) between series representing the same function. 

Next, infinite series representing transcendental Lamé functions are discussed. It is 
seen that transcendental Lamé functions are only simply-periodic (§§ 18, 19). Lamé 
functions of real (§§ 20-22) and imaginary (§§ 23-24) period are represented by series of 
Legendre functions whose variables differ in the two cases. 

The paper concludes with a brief discussion of the most important limiting cases, and a 
short mention of other types of series of Legendre functions representing Lamé functions. 


REFERENCES TO LITERATURE 

Darwin, G. H., 1901. “Ellipsoidal harmonic analysis”, Trans. Roy. Soc. London, A, CXCVII, 
461-557. 

Gauss, C. F., 1866. Gesammelte Werke, III. 

Heine, E., 1859 a. “Auszug eines Schreibens über Lamésche Funktionen an den Herausgeber”, 
Journ. für Math., LVI, 79-86. 

-, 1859 b. “Einige Eigenschaften der Laméschen Funktionen”, Journ. für Math., LVI, 87-99. 

-, 1878. Handbuch der Kugelfunktionen, I. 

Hobson, E. W., 1892. “The harmonic functions for the elliptic cone”, Proc. London Math. Soc., 
XXIII, 231-240. 

-, 1931. The theory of spherical and ellipsoidal harmonics. 

Humbert, P., 1926. “Fonctions de Lamé et fonctions de Mathieu”, Mém. Sci. Math., Fasc. X. 

Ince, E. L., 1940 a. “The periodic Lamé functions”, Proc. Roy. Soc. Edin., LX, 47-63. 

-, 1940 b. “Further investigations into the periodic Lamé functions”, Proc. Roy. Soc. Edin., 
LX, 83-99. 

Lambe, C. G., and Ward, D. R., 1934. “Some differential equations and associated integral 
equations”, Quart. Journ. Math. (Oxford), V, 81-97. 

Lindemann, F., 1883. “Ueber die Differentialgleichung der Funktionen des elliptischen Zylinders”, 
Math. Ann., XXII, 117-123. 

Perron, O., 1913. Die Lehre von den Kettenbrüchen. 

Schubert, H., 1886. “Ueber die Integration der Differentialgleichung “gji 4-£ s U=o”, 
Dissertation, Konigsberg. 

Strutt, M. J. O., 1932. “Lamésche-Mathieusche und verwandte Funktionen in Physik und 
Technik”, Ergebnisse der Math. und ihrer Grenzgebiete, Band I, Heft 3. 

Whittaker, E. T., and Watson, G. N., 1927. Modern Analysis. 


(Issued separately February 25, 1948) 





XXVIII.— The Discriminant of a Certain Ternary Quartic. By W. L. Edge, 
M.A., Sc.D. (Cantab.), Mathematical Institute, University of Edinburgh. 

(MS. received January 8, 1946. Read May 6, 1946) 

1. By the discriminant $D$ of a homogeneous polynomial $\phi$ is, in accordance with the general 
custom, to be understood that function of its coefficients whose vanishing is the necessary and 
sufficient condition for the locus $\phi = 0$ to have a node. It is the resultant, or eliminant, of the 
set of equations obtained by equating all the first partial derivatives of $\phi$ simultaneously to 
zero. If $\phi$ contains $n$ variables and is of order $p$, the degree of $D$ in the coefficients of $\phi$ is 
$n(p-1)^{n-1}$. 

Suppose, now and henceforward, that $\phi$ is a ternary quartic $\phi(x_1, x_2, x_3)$. Then $D$ is of 
degree 27, and there is a procedure, attributed by Klein (1890, p. 56) to Gordan, for writing 
down a certain determinant which is a constant multiple of $D$ without any extraneous factors; 
this determinant has fifteen rows, of which nine consist of elements linear in the coefficients 
of $\phi$, and the remaining six consist of elements of the third degree in these coefficients. Gordan's 
procedure is as follows. He remarks that if $(x_1, x_2, x_3)$ is a node of $\phi = 0$, it is common to the 
polar curves $\phi_1 = 0$, $\phi_2 = 0$, $\phi_3 = 0$, where $\phi_i$ denotes $\partial\phi/\partial x_i$; the nine quartic polynomials 

$$x_1\phi_1,\ x_1\phi_2,\ x_1\phi_3,\ x_2\phi_1,\ x_2\phi_2,\ x_2\phi_3,\ x_3\phi_1,\ x_3\phi_2,\ x_3\phi_3,$$

whose coefficients are linear in those of $\phi$, therefore all vanish at the node. He adds six further 
quartic polynomials, with coefficients of the third degree in those of $\phi$, which also vanish at 
the node. The determinantal form of $D$ follows instantly. These six latter polynomials are 
obtained by using a certain concomitant of $\phi$ encountered by Dersch (1874, p. 510) in the 
theory of the bitangents, namely 

$$3T = (abc)^2 \sum a_x^{p} a_y^{2-p}\, b_x^{q} b_y^{2-q}\, c_x^{r} c_y^{2-r},$$

where the summation is over all non-negative integers $p$, $q$, $r$ whose sum is 4 yet none of which 
exceeds 2, $a$, $b$, $c$ being Aronhold symbols for $\phi = a_x^4 = b_x^4 = c_x^4$. Gordan remarks that $T$ must 
vanish identically in $y$ if $(x_1, x_2, x_3)$ is a node of $\phi = 0$; thus, if 

$$T = T_{11}y_1^2 + T_{22}y_2^2 + T_{33}y_3^2 + 2T_{23}y_2y_3 + 2T_{31}y_3y_1 + 2T_{12}y_1y_2,$$

the co-ordinates of the node cause each of the six expressions $T_{ij}$ to vanish, and these expressions 
are quartic polynomials whose coefficients are of the third degree in those of $\phi$. 

This strikingly elegant and convincing fashion of writing down a determinantal form for D 
is not at all well known; it has remained inviolate in its adornment of Klein’s paper, and 
escaped the notice both of writers of textbooks and of compilers of the standard articles on 
plane algebraic curves. The object of these few pages is to furnish for the first time an 
illustration of Gordan’s procedure by the actual working out of an example. 
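The bookkeeping behind Gordan's determinant can be tallied in a few lines. The sketch below only re-counts quantities stated above (fifteen rows, nine linear and six cubic in the coefficients, total degree 27); the function names are illustrative.

```python
from math import comb

def discriminant_degree(n_vars, order):
    """Degree n (p - 1)^(n - 1) of the discriminant in the coefficients."""
    return n_vars * (order - 1) ** (n_vars - 1)

def monomial_count(n_vars, degree):
    """Number of monomials of the given degree in n_vars variables."""
    return comb(degree + n_vars - 1, n_vars - 1)

rows_linear, rows_cubic = 9, 6                 # x_i * phi_j rows and T_ij rows
determinant_degree = rows_linear * 1 + rows_cubic * 3

print(discriminant_degree(3, 4))               # ternary quartic
print(monomial_count(3, 4))                    # columns of the determinant
print(rows_linear + rows_cubic, determinant_degree)
```

The fifteen rows match the fifteen quartic monomials, and the degree $9 + 18$ of the determinant equals that of $D$, which is why no extraneous factor can appear.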

2. We choose for this purpose the specialised quartic 

$$\phi(x_1, x_2, x_3) = ax_1^4 + bx_2^4 + cx_3^4 + 6fx_2^2x_3^2 + 6gx_3^2x_1^2 + 6hx_1^2x_2^2,$$

and it is this that is henceforward signified by the symbol $\phi$. This specialised form is used by 
Salmon (1879, pp. 269-274) at the end of the chapter on quartic curves in his treatise, and he 
calculates, undaunted by the far from inconsiderable labour involved, several of its invariants 
and covariants. It may be the burden of carrying through these sesquipedalian calculations 
that is the cause of Salmon's surprising omission to notice what the discriminant of $\phi$ actually 
is; although he alludes (p. 274) expressly to it he conveys no hint that he has detected its true 
structure. Yet the detection is simple enough. For $\phi = 0$ becomes a pair of conics, and so 
has four nodes, if 

$$\Delta \equiv \begin{vmatrix} a & 3h & 3g \\ 3h & b & 3f \\ 3g & 3f & c \end{vmatrix} = 0,$$

has a node at a vertex of the triangle of reference if any one of $a$, $b$, $c$ vanishes, and has a pair 
of nodes on a side of the triangle of reference if any one of $A$, $B$, $C$ (by which are signified 
the cofactors of $a$, $b$, $c$ in $\Delta$) vanishes. Thus it is to be expected that the factors $abcA^2B^2C^2\Delta^4$ 
occur in $D$, and since they make up the requisite degree there can be no further factor except 
a numerical multiplier. The application of Gordan's procedure to $\phi$ gives a result in agreement 
with this. 

3. The working out of the determinantal form for $D$ proves to be quite feasible. For it 
transpires that, with this special form of $\phi$, $T_{23}$ contains only three terms whose coefficients 
do not vanish, namely $x_1^2x_2x_3$, $x_2^3x_3$, $x_2x_3^3$. The same is clearly true of $x_3\phi_2$ and $x_2\phi_3$. There is 
thus an isolated block of nine elements at the intersections of those three rows of $D$ associated 
with $T_{23}$, $x_3\phi_2$, $x_2\phi_3$ and those three columns of $D$ associated with $x_1^2x_2x_3$, $x_2^3x_3$, $x_2x_3^3$; “isolated”, 
that is, in the sense that all other elements of these rows are zero. A second such 
isolated set of nine elements occurs in the rows associated with $T_{31}$, $x_1\phi_3$, $x_3\phi_1$, and a third 
in the rows associated with $T_{12}$, $x_2\phi_1$, $x_1\phi_2$. $D$ is therefore the product of the three three-rowed 
determinants (each of degree five in the coefficients of $\phi$) arising from these isolated 
blocks and of an outstanding six-rowed determinant. The three-rowed determinants will be 
found to be numerical multiples of $A\Delta$, $B\Delta$, $C\Delta$, and the six-rowed determinant a numerical 
multiple of $abcABC\Delta$. 
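That $x_3\phi_2$ and $x_2\phi_3$ involve only the three monomials $x_1^2x_2x_3$, $x_2^3x_3$, $x_2x_3^3$ is easy to confirm mechanically. The sketch below represents a polynomial as a dictionary from exponent triples to coefficients; this representation, the helper names and the sample coefficient values are illustrative assumptions, not anything used in the paper.

```python
def phi_terms(a, b, c, f, g, h):
    """Edge's specialised quartic as {(e1, e2, e3): coefficient}."""
    return {(4, 0, 0): a, (0, 4, 0): b, (0, 0, 4): c,
            (0, 2, 2): 6 * f, (2, 0, 2): 6 * g, (2, 2, 0): 6 * h}

def partial(poly, i):
    """Partial derivative with respect to the variable of index i (0-based)."""
    out = {}
    for exps, coeff in poly.items():
        if exps[i] == 0:
            continue
        new = list(exps)
        new[i] -= 1
        out[tuple(new)] = out.get(tuple(new), 0) + coeff * exps[i]
    return out

def times_x(poly, i):
    """Multiply every term by the variable of index i."""
    out = {}
    for exps, coeff in poly.items():
        new = list(exps)
        new[i] += 1
        out[tuple(new)] = coeff
    return out

phi = phi_terms(2, 3, 5, 7, 11, 13)          # arbitrary sample coefficients
x3_phi2 = times_x(partial(phi, 1), 2)        # x_3 * d(phi)/d(x_2)
x2_phi3 = times_x(partial(phi, 2), 1)        # x_2 * d(phi)/d(x_3)
print(sorted(x3_phi2.items()))
print(sorted(x2_phi3.items()))
```

Both polynomials have the same three-element support, which is what isolates the corresponding block of rows and columns.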

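4. Dersch's concomitant is 

$$3T \equiv (abc)^2\{a_y^2b_x^2c_x^2 + a_x^2b_y^2c_x^2 + a_x^2b_x^2c_y^2 + a_x^2b_xb_yc_xc_y + a_xa_yb_x^2c_xc_y + a_xa_yb_xb_yc_x^2\},$$

which, since $a$, $b$, $c$ are equivalent symbols, gives 

$$T = (abc)^2(a_y^2b_x^2c_x^2 + a_x^2b_xb_yc_xc_y).$$

Every symbol, whether it be $a$, $b$ or $c$, appears in every term to degree four and, because of the 
special form of $\phi$, all such combinations of symbols vanish identically except 

$$a_1^4 = b_1^4 = c_1^4 = a, \qquad a_2^4 = b_2^4 = c_2^4 = b, \qquad a_3^4 = b_3^4 = c_3^4 = c,$$

$$a_2^2a_3^2 = b_2^2b_3^2 = c_2^2c_3^2 = f, \qquad a_3^2a_1^2 = b_3^2b_1^2 = c_3^2c_1^2 = g, \qquad a_1^2a_2^2 = b_1^2b_2^2 = c_1^2c_2^2 = h.$$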


5. We proceed now to the calculation of 

$$2T_{23} = (abc)^2\{2a_2a_3b_x^2c_x^2 + (b_2c_3 + b_3c_2)a_x^2b_xc_x\}.$$

Any non-vanishing contribution from $2(abc)^2a_2a_3b_x^2c_x^2$ must contain the product $a_2^2a_3^2$ and 
so arise from 

$$4a_2^2a_3^2(b_3c_1 - b_1c_3)(b_1c_2 - b_2c_1)b_x^2c_x^2$$

$$= 4f(b_3b_1c_1c_2 - b_2b_3c_1^2 - b_1^2c_2c_3 + b_1b_2c_3c_1)b_x^2c_x^2$$

$$= 8f(b_3b_1c_1c_2 - b_2b_3c_1^2)b_x^2c_x^2,$$

since the symbols $b$ and $c$ are equivalent. Rejecting, next, terms which vanish because the 
symbols $b$ do not appear in the requisite combinations, and then omitting, from the terms 
retained, those which vanish because the symbols $c$ do not appear in the requisite combinations, 
there remains 

$$8f(2b_1^2b_3^2c_1c_2x_3x_1 - 2b_2^2b_3^2c_1^2x_2x_3)c_x^2$$

$$= 16f(g\,c_1c_2x_3x_1 - f\,c_1^2x_2x_3)c_x^2$$

$$= 16f\{2g\,c_1^2c_2^2x_1^2x_2x_3 - f(c_1^4x_1^2 + c_1^2c_2^2x_2^2 + c_1^2c_3^2x_3^2)x_2x_3\}$$

$$= 16f\{(2gh - af)x_1^2 - hfx_2^2 - fgx_3^2\}x_2x_3. \tag{5.1}$$




In order to obtain $2T_{23}$ we have to add to this the contribution from $(b_2c_3 + b_3c_2)(abc)^2a_x^2b_xc_x$ 
which, since $b$ and $c$ are equivalent, is double the contribution from $b_2c_3(abc)^2a_x^2b_xc_x$. This is 

$$(b_1b_2x_1 + b_2^2x_2 + b_2b_3x_3)(c_1c_3x_1 + c_2c_3x_2 + c_3^2x_3)\,a_x^2(abc)^2. \tag{5.2}$$

Now the two trinomials in brackets, when multiplied together, give a sum consisting of nine 
terms. Of these nine terms one is $b_2^2c_3^2x_2x_3$, and this gives rise, when multiplied by $a_x^2(abc)^2$, to 
six non-vanishing contributions; in fact 

$$a_x^2b_2^2c_3^2(abc)^2 = a_x^2\{(bc + f^2)a_1^2 + (ch + fg)a_2^2 + (bg + hf)a_3^2\}$$

$$= (bc + f^2)(ax_1^2 + hx_2^2 + gx_3^2) + (ch + fg)(hx_1^2 + bx_2^2 + fx_3^2) + (bg + hf)(gx_1^2 + fx_2^2 + cx_3^2). \tag{5.3}$$

Each of the remaining eight terms arising from the product of the two trinomials in (5.2) 
gives, it is found, only one non-vanishing contribution when multiplied by $(abc)^2$. The term 
$b_2^2c_1c_3x_1x_2$, for example, must be coupled with terms of $(abc)^2$ that contain the product $c_1c_3$; 
but in this squared determinant every term contains each suffix twice and twice only, so that 
no term which contains $c_1c_3$ can contain either $b_1^2$ or $b_3^2$. Thus, in order to provide a 
non-vanishing contribution, $b_2^2c_1c_3x_1x_2$ must be multiplied by a term of $(abc)^2$ which contains 
$b_2^2c_1c_3$, and the only such term is $-2a_3a_1b_2^2c_1c_3$. The same mode of reasoning shows that the 
term $b_2^2c_2c_3x_2^2$ in the product of the trinomials must, in order that it should provide a 
non-vanishing contribution, be multiplied by the term $-2a_2a_3b_1^2c_2c_3$ of $(abc)^2$. Considerations of 
this kind show that, taking the eight terms other than $b_2^2c_3^2x_2x_3$ from the trinomial product, 

$$a_x^2(b_2^2c_1c_3x_1x_2 + b_2^2c_2c_3x_2^2 + b_1b_2c_3^2x_3x_1 + b_2b_3c_3^2x_3^2 + b_1b_2c_1c_3x_1^2 + b_2b_3c_1c_3x_3x_1 + b_1b_2c_2c_3x_1x_2 + b_2b_3c_2c_3x_2x_3)(abc)^2$$

$$= -4bg^2x_1^2x_2x_3 - 4hf^2x_2^3x_3 - 4ch^2x_1^2x_2x_3 - 4f^2gx_2x_3^3 + 4fghx_1^2x_2x_3 + 4fghx_1^2x_2x_3 + 4fghx_1^2x_2x_3 - 2f^2(ax_1^2 + hx_2^2 + gx_3^2)x_2x_3.$$

Adding this to the product of $x_2x_3$ and the expression (5.3) we find, as the contribution of 
$b_2c_3(abc)^2a_x^2b_xc_x$ to $2T_{23}$, the product of $x_2x_3$ and 

$$(abc + 14fgh - af^2 - 3bg^2 - 3ch^2)x_1^2 + 2(bch + bfg - 2hf^2)x_2^2 + 2(bcg + chf - 2f^2g)x_3^2.$$

This result, combined with (5.1), gives finally 

$$T_{23} = \{(abc + 30fgh - 9af^2 - 3bg^2 - 3ch^2)x_1^2 + 2(bch + bfg - 6hf^2)x_2^2 + 2(bcg + chf - 6f^2g)x_3^2\}x_2x_3,$$

thus showing, as foretold in section 3, that $T_{23}$ contains only three of the fifteen terms that can 
be present in a homogeneous quartic polynomial in three variables. 

6. Since 

$$\tfrac14 x_3\phi_2 = (3hx_1^2 + bx_2^2 + 3fx_3^2)x_2x_3$$

and 

$$\tfrac14 x_2\phi_3 = (3gx_1^2 + 3fx_2^2 + cx_3^2)x_2x_3,$$

the isolated determinant that is associated with the triad of polynomials $T_{23}$, $x_3\phi_2$, $x_2\phi_3$ is a 
numerical multiple of 

$$\begin{vmatrix} abc + 30fgh - 9af^2 - 3bg^2 - 3ch^2 & 3h & 3g \\ 2(bch + bfg - 6hf^2) & b & 3f \\ 2(bcg + chf - 6f^2g) & 3f & c \end{vmatrix},$$

the determinant being written in this rather than in the transposed form to economise horizontal 
space. The last two columns are the same as the last two columns of $\Delta$. Now add to the 
first column the product of $ch - 5fg$ and the second column as well as the product of $bg - 5hf$ 
and the third column. This adjustment does not change the value of the determinant, and 
it turns its first column into the product of $bc - 9f^2$, which is $A$, and the first column of $\Delta$. So 
that the value of the determinant is $A\Delta$. 

Since the simultaneous cyclic interchanges of the suffixes 1, 2, 3, of the coefficients $a$, $b$, $c$ 
and of the coefficients $f$, $g$, $h$ turn $T_{23}$, $x_3\phi_2$, $x_2\phi_3$ into $T_{31}$, $x_1\phi_3$, $x_3\phi_1$, respectively, the isolated 
determinant associated with this latter triad of polynomials is a multiple of $B\Delta$. And that 
associated with the triad $T_{12}$, $x_2\phi_1$, $x_1\phi_2$ is a multiple of $C\Delta$. 
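The evaluation of the first isolated determinant as $A\Delta$ can be confirmed with exact integer arithmetic. The sketch below is an independent check written for this purpose; the function names are illustrative and the coefficient values are arbitrary.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def isolated_block(a, b, c, f, g, h):
    """Three-rowed determinant attached to T_23, x_3 phi_2, x_2 phi_3."""
    return [[a*b*c + 30*f*g*h - 9*a*f*f - 3*b*g*g - 3*c*h*h, 3*h, 3*g],
            [2 * (b*c*h + b*f*g - 6*h*f*f), b, 3*f],
            [2 * (b*c*g + c*h*f - 6*f*f*g), 3*f, c]]

def delta(a, b, c, f, g, h):
    """The determinant whose vanishing makes the quartic a pair of conics."""
    return det3([[a, 3*h, 3*g], [3*h, b, 3*f], [3*g, 3*f, c]])

values = (2, 3, 5, 1, -1, 2)                   # a, b, c, f, g, h
a, b, c, f, g, h = values
print(det3(isolated_block(*values)), (b*c - 9*f*f) * delta(*values))
```

The two printed numbers agree for any integer choice of the six coefficients, since the column operation described above is exact.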

7. The three isolated three-rowed determinants having now been accounted for, it only 
remains to find the outstanding six-rowed determinant. This is constituted by elements that 
occupy those of the fifteen rows and columns which are not involved in any of the three 
determinants that have been dealt with; it is thus associated with the six polynomials 

$$T_{11},\ T_{22},\ T_{33},\ x_1\phi_1,\ x_2\phi_2,\ x_3\phi_3,$$

but only with those of their terms wherein occur the combinations 

$$x_1^4,\ x_2^4,\ x_3^4,\ x_2^2x_3^2,\ x_3^2x_1^2,\ x_1^2x_2^2.$$

Now 

$$T_{11} = (abc)^2(a_1^2b_x^2c_x^2 + a_x^2b_1c_1b_xc_x),$$

and it will be agreed that the details of the evaluation may be passed over since enough 
explanation has already been given in the calculation of $T_{23}$ to enable a reader to carry through the 
necessary manipulations. The upshot is that $\tfrac12T_{11}$ contains the terms 

$$6aghx_1^4 + b(af + gh)x_2^4 + c(af + gh)x_3^4 + (abc + 6fgh - 3af^2)x_2^2x_3^2 + 3(ach + afg - 2g^2h)x_3^2x_1^2 + 3(abg + ahf - 2gh^2)x_1^2x_2^2,$$

while the corresponding terms in $\tfrac12T_{22}$ and $\tfrac12T_{33}$ are of course derived from these by imposing 
cyclic interchanges on 1, 2, 3, on $a$, $b$, $c$ and on $f$, $g$, $h$. Hence, since 

$$\tfrac14 x_1\phi_1 = ax_1^4 + 3hx_1^2x_2^2 + 3gx_3^2x_1^2,$$

$$\tfrac14 x_2\phi_2 = bx_2^4 + 3fx_2^2x_3^2 + 3hx_1^2x_2^2,$$

$$\tfrac14 x_3\phi_3 = cx_3^4 + 3gx_3^2x_1^2 + 3fx_2^2x_3^2,$$

the six-rowed determinant is a numerical multiple of 

$$\begin{vmatrix}
6agh & b(af+gh) & c(af+gh) & abc + 6fgh - 3af^2 & 3(cah + afg - 2g^2h) & 3(abg + ahf - 2gh^2) \\
a(bg+hf) & 6bhf & c(bg+hf) & 3(bch + bfg - 2hf^2) & abc + 6fgh - 3bg^2 & 3(abf + bgh - 2h^2f) \\
a(ch+fg) & b(ch+fg) & 6cfg & 3(bcg + chf - 2f^2g) & 3(caf + cgh - 2fg^2) & abc + 6fgh - 3ch^2 \\
a & \cdot & \cdot & \cdot & 3g & 3h \\
\cdot & b & \cdot & 3f & \cdot & 3h \\
\cdot & \cdot & c & 3f & 3g & \cdot
\end{vmatrix}.$$

The factors $a$, $b$, $c$ can be removed one from each of the first three columns of this 
determinant. Having removed them, we then modify the fourth column by subtracting from it 
the product of $3f$ and the sum of the second and third columns; similarly we subtract from 
the fifth the product of $3g$ and the sum of the third and first columns, and from the sixth the 
product of $3h$ and the sum of the first and second columns. This produces a block of nine 
zeros in the right-hand bottom corner of the determinant, while the nine elements in the 
right-hand top corner constitute the determinant 

$$\begin{vmatrix} a(bc - 9f^2) & 3h(ca - 9g^2) & 3g(ab - 9h^2) \\ 3h(bc - 9f^2) & b(ca - 9g^2) & 3f(ab - 9h^2) \\ 3g(bc - 9f^2) & 3f(ca - 9g^2) & c(ab - 9h^2) \end{vmatrix},$$

which is $ABC\Delta$. The six-rowed determinant is thus a numerical multiple of $abcABC\Delta$. 
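The six-rowed determinant can likewise be evaluated exactly for sample coefficients. In the sketch below the rows are entered in the order $\tfrac12T_{11}$, $\tfrac12T_{22}$, $\tfrac12T_{33}$, $\tfrac14x_1\phi_1$, $\tfrac14x_2\phi_2$, $\tfrac14x_3\phi_3$ and the columns in the order $x_1^4$, $x_2^4$, $x_3^4$, $x_2^2x_3^2$, $x_3^2x_1^2$, $x_1^2x_2^2$; with that arrangement the numerical multiple works out to $-1$, a value computed here and not stated in the paper. Function names are illustrative.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for 6 x 6)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def six_rowed(a, b, c, f, g, h):
    """Rows: T_11/2, T_22/2, T_33/2, x1*phi1/4, x2*phi2/4, x3*phi3/4."""
    return [
        [6*a*g*h, b*(a*f + g*h), c*(a*f + g*h),
         a*b*c + 6*f*g*h - 3*a*f*f, 3*(c*a*h + a*f*g - 2*g*g*h), 3*(a*b*g + a*h*f - 2*g*h*h)],
        [a*(b*g + h*f), 6*b*h*f, c*(b*g + h*f),
         3*(b*c*h + b*f*g - 2*h*f*f), a*b*c + 6*f*g*h - 3*b*g*g, 3*(a*b*f + b*g*h - 2*h*h*f)],
        [a*(c*h + f*g), b*(c*h + f*g), 6*c*f*g,
         3*(b*c*g + c*h*f - 2*f*f*g), 3*(c*a*f + c*g*h - 2*f*g*g), a*b*c + 6*f*g*h - 3*c*h*h],
        [a, 0, 0, 0, 3*g, 3*h],
        [0, b, 0, 3*f, 0, 3*h],
        [0, 0, c, 3*f, 3*g, 0],
    ]

def expected_value(a, b, c, f, g, h):
    """-abc * A * B * C * Delta, with A, B, C the cofactors of a, b, c in Delta."""
    A, B, C = b*c - 9*f*f, c*a - 9*g*g, a*b - 9*h*h
    delta = a*b*c + 54*f*g*h - 9*(a*f*f + b*g*g + c*h*h)
    return -a * b * c * A * B * C * delta

values = (2, 3, 5, 1, -1, 2)
print(det(six_rowed(*values)), expected_value(*values))
```

Only the fact that the determinant is a numerical multiple of $abcABC\Delta$ matters for the argument; the sign depends on the chosen order of rows and columns.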




This is all the information that we require, and Gordan's procedure has now established 
the discriminant of $\phi$ to be 

$$D = abcA^2B^2C^2\Delta^4,$$

no numerical multiplier being necessary if it is stipulated that the term $a^9b^9c^9$ has coefficient $+1$. 

8. Before closing, a word may perhaps be said concerning the more special quartic 

$$\psi \equiv \lambda(x_1^4 + x_2^4 + x_3^4) + 6\mu(x_2^2x_3^2 + x_3^2x_1^2 + x_1^2x_2^2),$$

for which 

$$a = b = c = \lambda, \qquad f = g = h = \mu,$$

$$A = B = C = \lambda^2 - 9\mu^2 = (\lambda + 3\mu)(\lambda - 3\mu),$$

$$\Delta = \lambda^3 - 27\lambda\mu^2 + 54\mu^3 = (\lambda + 6\mu)(\lambda - 3\mu)^2;$$

$$D = abcA^2B^2C^2\Delta^4 = \lambda^3(\lambda + 3\mu)^6(\lambda + 6\mu)^4(\lambda - 3\mu)^{14}.$$

If $\lambda = 0$, the curve $\psi = 0$ is trinodal, having a node at each vertex of the triangle of reference. 

If $\lambda = -3\mu$, the curve consists of four lines and so has six nodes; the quadrilateral $q$ formed 
by the lines has the triangle of reference for diagonal triangle. 

If $\lambda = -6\mu$, the curve is a pair of conics and so has four nodes. 

If $\lambda = 3\mu$, then not only does $\Delta$ vanish but its rank sinks to 1, and all its first minors, including 
$A$, $B$, $C$, also vanish; $\psi = 0$ is a repeated conic $S$. 

As $\lambda : \mu$ varies, $\psi = 0$ describes a pencil of quartic curves which all touch one another at 
the eight intersections of $S$ with the sides of $q$. Among the members of a pencil of plane 
quartics there are, in general, 27 nodal curves; they correspond to those values of the 
parameter for which the discriminant vanishes. In the above pencil these 27 curves are all 
accounted for by only four distinct curves, these four curves being reckoned with respective 
multiplicities 3, 4, 6, 14. No other curve of the pencil can possess a node. That the repeated 
conic contributes 14 to the total of 27 nodal curves is noteworthy. 
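The factorisation of $D$ for the special quartic, and the count of multiplicities, can be verified with integers. The sketch below assumes only the formula $D = abcA^2B^2C^2\Delta^4$ with $a = b = c = \lambda$ and $f = g = h = \mu$; the function names are illustrative.

```python
def special_discriminant(lam, mu):
    """D = abc A^2 B^2 C^2 Delta^4 with a = b = c = lam and f = g = h = mu."""
    a = b = c = lam
    f = g = h = mu
    A, B, C = b*c - 9*f*f, c*a - 9*g*g, a*b - 9*h*h
    delta = a*b*c + 54*f*g*h - 9*(a*f*f + b*g*g + c*h*h)
    return a * b * c * A**2 * B**2 * C**2 * delta**4

def factored_form(lam, mu):
    """lam^3 (lam + 3 mu)^6 (lam + 6 mu)^4 (lam - 3 mu)^14."""
    return lam**3 * (lam + 3*mu)**6 * (lam + 6*mu)**4 * (lam - 3*mu)**14

multiplicities = {"lam = 0": 3, "lam = -3 mu": 6, "lam = -6 mu": 4, "lam = 3 mu": 14}
print(special_discriminant(2, 1) == factored_form(2, 1))
print(sum(multiplicities.values()))
```

The multiplicities add up to 27, the number of nodal curves in a general pencil of plane quartics.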


REFERENCES TO LITERATURE 

Dersch, O., 1874. “Doppeltangenten einer Curve $n^{\text{ter}}$ Ordnung”, Math. Ann., VII, 497-511. 

Klein, F., 1890. “Zur Theorie der Abelschen Functionen”, Math. Ann., XXXVI, 1-83; Gesammelte 
Mathematische Abhandlungen, III, 388-473. 

Salmon, G., 1879. A treatise on the higher plane curves, Third Edition, Dublin. 


(Issued separately February 25, 1948) 





XXIX.— On a Problem in Correlated Errors. By A. C. Aitken, D.Sc., F.R.S., 
Mathematical Institute, University of Edinburgh 

(MS. received January 16, 1946. Read May 6, 1946) 

1. Introductory 

The problem with which this paper is concerned arose in the discussion of a series of 
chronometric observations, but it is of more general application, and is capable of wide extension. 
Pairs of readings $(x_i, y_i)$ were taken at times $t_i$, $i = 1, 2, \ldots, n$. These readings were known 
to be affected by respective errors $(\xi_i, \eta_i)$ from sources different but possessing some common 
part. It was important to have an estimate of the consequent correlation and to assess its 
precision. The assumptions made in the particular experiment were that $x$ and $y$ were both 
linear in $t$, representable by $x = a_0 + a_1t$, $y = b_0 + b_1t$, and that the distributions of error in $x$ and 
$y$ were normal. The parameters $a_0$ and $a_1$, $b_0$ and $b_1$ were therefore obtained from two separate 
sets of normal equations, and the unknown correlation was then estimated from the sum of 
products of corresponding residuals $u_i$, $v_i$, one from each set. In the corresponding situation 
in $n$ samples $(x_i, y_i)$ from a bivariate normal distribution the mean value of $\sum(x_i - \bar{x})(y_i - \bar{y})$ is 
$(n - 1)\rho\sigma_1\sigma_2$, where $\sigma_1^2$, $\sigma_2^2$ are the variances of $x$ and $y$ and $\rho\sigma_1\sigma_2$ is their product moment. 
One might therefore anticipate, by analogy, that in the present case the mean value of $\sum u_iv_i$ 
would be $(n - 2)\rho\sigma_1\sigma_2$. So indeed it proves to be, and the sampling variance of $\sum u_iv_i$ conforms 
likewise with standard results; but it is desirable, by an extension of the problem, both to see 
why this is so and to take notice of cases where the analogy fails to hold. 

2. Generalization of the Problem 

We shall extend the problem, adopting notations suited to a matrix formulation. Let there 
be $n$ independent pairs of observations $(x_i + \xi_i, y_i + \eta_i)$, where $\xi_i$, $\eta_i$ are errors normally 
distributed with variances $\sigma_1^2$, $\sigma_2^2$ and product moment $\rho\sigma_1\sigma_2$. Let $x$ and $y$ be represented by 

$$x = a_0 + a_1p_1(t) + a_2p_2(t) + \ldots + a_{k-1}p_{k-1}(t), \tag{1}$$

$$y = b_0 + b_1p_1(t) + b_2p_2(t) + \ldots + b_{k-1}p_{k-1}(t), \tag{2}$$

where the $p_j(t_i)$ are values of a prescribed basis of functions, linearly independent over the 
range of $t$. These might commonly be polynomial or harmonic, but might even be arbitrary. 
Let us write 

$$x = \{x_1\, x_2 \ldots x_n\}, \qquad y = \{y_1\, y_2 \ldots y_n\}; \tag{3}$$

$$a = \{a_0\, a_1 \ldots a_{k-1}\}, \qquad b = \{b_0\, b_1 \ldots b_{k-1}\}; \tag{4}$$

and so on for all vectors concerned, and let $P = [p_j(t_i)]$ be the $n \times k$ matrix of the functional 
values. Then the two sets of observational equations are $Pa = x$, $Pb = y$; the normal equations 
are $P'Pa = P'x$, $P'Pb = P'y$, and so the solutions for $a$ and $b$ are 

$$a = (P'P)^{-1}P'x, \qquad b = (P'P)^{-1}P'y. \tag{5}$$

The vectors of residuals are therefore 

$$u = Mx, \qquad v = My, \qquad\text{where } M = I - P(P'P)^{-1}P', \tag{6}$$

and so the variance matrices of the two separate sets of residuals are $\sigma_1^2M$ and $\sigma_2^2M$, since $MM' = M$. 

In view of the linear independence of the $k$ functions $p_j(t)$ we see that $P$ is of rank $k$, and so 
$P(P'P)^{-1}P'$ is of rank $k$, and $M$ is of rank $n - k$. Again, since $M^2 = M$, it follows that $M$ is 
idempotent with $n - k$ latent roots equal to 1, and $k$ roots equal to 0. It is therefore reducible 
by orthogonal transformation $HMH'$ to a diagonal matrix in which the leading submatrix 
of order $n - k$ is a unit matrix, all other elements being null. 
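The stated properties of $M$ are easy to confirm with exact rational arithmetic for the linear case $k = 2$ (basis $1$, $t$). The sketch below is an illustration only: the helper names and the sample times are assumptions, and the explicit $2 \times 2$ inverse restricts it to a two-function basis.

```python
from fractions import Fraction

def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(row) for row in zip(*a)]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def residual_projector(times):
    """M = I - P (P'P)^{-1} P' for the basis {1, t} evaluated at the given times."""
    p = [[Fraction(1), Fraction(t)] for t in times]
    pt = transpose(p)
    hat = matmul(matmul(p, inverse_2x2(matmul(pt, p))), pt)
    n = len(times)
    return [[(1 if i == j else 0) - hat[i][j] for j in range(n)] for i in range(n)]

M = residual_projector([0, 1, 2, 4, 7])        # n = 5 observation times
trace = sum(M[i][i] for i in range(len(M)))
print(matmul(M, M) == M, trace)
```

For an idempotent matrix the trace equals the rank, so the trace $n - k = 3$ counts the latent roots equal to 1.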

The variance matrix of the partitioned vector $z = \{x \mid y\}$ is 

$$V = \begin{bmatrix} \sigma_1^2I & \rho\sigma_1\sigma_2I \\ \rho\sigma_1\sigma_2I & \sigma_2^2I \end{bmatrix}, \tag{7}$$

a matrix of order $2n \times 2n$, quadripartite as above. The multiple normal probability differential 
of $z$ is thus, apart from a constant factor, $\exp(-\tfrac12 z'V^{-1}z)\,dz$, where 

$$dz = dx_1\,dx_2 \ldots dx_n\,dy_1 \ldots dy_n.$$

Let us write $\sum u_iv_i$, namely 

$$u'v = x'M'My = x'My, \tag{8}$$

as a quadratic form in $z$, 

$$\tfrac12 z'Lz, \qquad\text{where } L = \begin{bmatrix} \cdot & M \\ M & \cdot \end{bmatrix}. \tag{9}$$

We construct a moment-generating function (m.g.f.) $G(\alpha)$ for $u'v$ by evaluating the $2n$-fold 
integral of 

$$\text{const.}\ \exp(\tfrac12\alpha z'Lz)\exp(-\tfrac12 z'V^{-1}z) \tag{10}$$

over the range $-\infty$ to $\infty$ in all variables. By the standard and well-known result, since a 
continuous range of values of $\alpha$ can be found such that the quadratic form in the exponent is 
negative definite over the whole range, we obtain 

$$G(\alpha) = |I - \alpha VL|^{-1/2}, \tag{11}$$

the constant factor being fixed by the fact that the term independent of $\alpha$ must be the moment 
of zero order, namely unity. Writing this result in partitioned form, we have 

$$G(\alpha) = \begin{vmatrix} I - \alpha\rho\sigma_1\sigma_2M & -\alpha\sigma_1^2M \\ -\alpha\sigma_2^2M & I - \alpha\rho\sigma_1\sigma_2M \end{vmatrix}^{-1/2} = |I - 2\alpha\rho\sigma_1\sigma_2M - \alpha^2(1 - \rho^2)\sigma_1^2\sigma_2^2M|^{-1/2}, \tag{12}$$

since the submatrices are commutative in multiplication, and $M^2 = M$. The orthogonal 
transformation $H(\ldots)H'$ can now be applied to the matrix enframed in this determinant, 
thus producing a purely diagonal matrix having units for its last $k$ diagonal elements. So 
finally, expanding by the product of the first $n - k$ diagonal elements, we have 

$$G(\alpha) = \{1 - 2\rho\sigma_1\sigma_2\alpha - (1 - \rho^2)\sigma_1^2\sigma_2^2\alpha^2\}^{-\frac12(n-k)}. \tag{13}$$

The coefficient of $\alpha$ in the expansion of the above gives the mean value of $\sum u_iv_i$ 
as $(n - k)\rho\sigma_1\sigma_2$. The coefficient of $\alpha^2/2!$ gives the mean square, and this, when adjusted so as 
to refer to the mean as origin, yields the sampling variance $(n - k)(1 + \rho^2)\sigma_1^2\sigma_2^2$. 
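The mean and variance quoted above follow most directly from the logarithm of the m.g.f., whose coefficients are the cumulants. The short derivation below is a worked check under the form of $G(\alpha)$ given in (13); the symbols $\kappa_1$, $\kappa_2$ for the first two cumulants are notation introduced here.

```latex
\begin{aligned}
\log G(\alpha)
  &= -\tfrac12 (n-k)\,\log\{1 - 2\rho\sigma_1\sigma_2\alpha
       - (1-\rho^2)\sigma_1^2\sigma_2^2\alpha^2\} \\
  &= \tfrac12 (n-k)\,\{2\rho\sigma_1\sigma_2\alpha
       + (1-\rho^2)\sigma_1^2\sigma_2^2\alpha^2
       + \tfrac12 (2\rho\sigma_1\sigma_2\alpha)^2 + O(\alpha^3)\} \\
  &= (n-k)\,\rho\sigma_1\sigma_2\,\alpha
       + (n-k)(1+\rho^2)\sigma_1^2\sigma_2^2\,\frac{\alpha^2}{2!} + O(\alpha^3), \\[4pt]
\kappa_1 &= (n-k)\,\rho\sigma_1\sigma_2, \qquad
\kappa_2 = (n-k)(1+\rho^2)\,\sigma_1^2\sigma_2^2 .
\end{aligned}
```

Working with $\log G$ avoids the separate step of referring the mean square to the mean as origin, since $\kappa_2$ is already the variance.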



3. Analogy with Standard Results 

The form of these results is not unfamiliar. The m.g.f. of the product-moment estimate 
from a sample of $n$ bivariate normally correlated observations can be derived (cf. Aitken, 1931) 
by a similar procedure, and is the same as $G(\alpha)$ in § 2 (13), but with $n - 1$ instead of $n - k$. 
It is indeed the special case in which the representation of $x$ and $y$ is by single constants. By 
a different approach Wishart (1928) found in this case the values of the sampling variance of 
the product moment, as well as many higher moments. We shall now trace the analogy to its 
source, and shall also show under what conditions it fails to hold. 

Let us make an orthogonal transformation from $x$ to $Hx$, $y$ to $Hy$, where $H$ is such that 
$HH' = I$ and $HMH' = J_{n-k}$, this last matrix being that particular diagonal canonical form of 
$M$ which has the first $n - k$ diagonal elements equal to unity and the remainder null. We have 
then $Hu = J_{n-k}Hx$, $Hv = J_{n-k}Hy$, while $u'v = u'H'Hv$, $\xi'\eta = \xi'H'H\eta$. Further, since the 
errors $\xi$ are uncorrelated, so are their transforms $H\xi$; likewise $H\eta$; and the respective variances 
are unchanged, since $\xi'\xi = \xi'H'H\xi$, $\eta'\eta = \eta'H'H\eta$. 

Thus the problem of the paper may be transferred entire, but in a simpler version, to the 
equivalent vectors $Hx$, $Hy$, $Hu$, $Hv$, $H\xi$, $H\eta$; and since $Hu$ now consists of the first $n - k$ 
elements of $Hx$, and $Hv$ of the first $n - k$ elements of $Hy$, we have therefore $n - k$ linearly 
independent transformed residuals $Hu$ correlated with a similar set $Hv$. In the bivariate 
sampling problem to which allusion has been made we have a similar pairing of sets of $n - 1$ 
linearly independent transformed residuals, the loss of a unit of rank (“degree of freedom”) 
being due in that case to the fact that the original residuals were deviations from the means of 
sample. Thus $n - 1$ in the earlier problem plays the part of $n - k$ in the present one. Indeed 
the derivation of $G(\alpha)$ in the earlier problem (Aitken, 1931) might have been simplified in a 
number of respects by the methods of the present paper. 

Naturally the question of the adequacy of the representation of $x$ and $y$ will in every case 
be the subject of examination a posteriori, by the usual division of the sum of squared 
deviations from the mean into two parts, namely the part used up in fitting the functions $p_j(t)$ and the part 
that may properly be called the sum of squared residuals. The principles of this “analysis 
of variance”, as applied (Fisher and Yates, 1943) to the fitting of orthogonal terms, are well 
known. 

4. Case of Different Representations 

For its tractable handling the problem clearly depends on the condition that the basis of
functions chosen for the representation of x and y shall be the same for each. If a different
basis is chosen for each, expressed by matrices P and T, say, then the transforming matrices
for the residuals, $M = I - P(P'P)^{-1}P'$ and $N = I - T(T'T)^{-1}T'$, will in general be
non-commutative. For example, if x is to be represented by a basis of polynomial terms, and y
by one of harmonic terms, the evaluation of the determinantal expression for G(a) will become
more complicated. There is, however, one intermediate case in which the commutative
rule still holds. Let it be supposed that by suitable linear combination we have replaced the
basis P by an equivalent orthonormal basis Q. This can always be done. Let x be expressed
in terms of $Q_h$ and y in terms of $Q_k$, where $Q_k$ is obtained from $Q_h$ by deleting the last $h-k$
columns. Thus x is expressed by h functions, and y by a subset of k of those functions, all
from one orthogonal set. Denoting $QQ'$, the “graduating matrix” (Aitken, 1945), by G,
we have (loc. cit.) $G_hG_k = G_kG_h = G_k$. Further, if we denote the vectors of residuals by
$u = M_hx$, $v = M_ky$, we have

$$M_hM_k = M_kM_h = (I - G_h)(I - G_k) = I - G_h = M_h. \tag{1}$$
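The commutative rule for nested bases can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes a polynomial basis sampled at n = 8 equally spaced points, with h = 4 and k = 2 chosen purely for illustration, and all function names are hypothetical.

```python
import math

def orthonormal_columns(n, count):
    """Gram-Schmidt on 1, t, t^2, ... sampled at n centred points (illustrative basis)."""
    t_vals = [i - (n - 1) / 2.0 for i in range(n)]
    cols = []
    for p in range(count):
        v = [t ** p for t in t_vals]
        for q in cols:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        cols.append([a / norm for a in v])
    return cols

def graduating_matrix(cols, n):
    """G = Q Q' for the given orthonormal columns."""
    return [[sum(q[i] * q[j] for q in cols) for j in range(n)] for i in range(n)]

def residual_matrix(G):
    """M = I - G."""
    n = len(G)
    return [[(1.0 if i == j else 0.0) - G[i][j] for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)] for i in range(n)]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

n, h, k = 8, 4, 2                     # illustrative sizes, with k < h
Q = orthonormal_columns(n, h)
G_h = graduating_matrix(Q[:h], n)
G_k = graduating_matrix(Q[:k], n)     # Q_k: first k columns of Q_h
M_h, M_k = residual_matrix(G_h), residual_matrix(G_k)

err_G = max_abs_diff(matmul(G_h, G_k), G_k)                   # G_h G_k = G_k
err_M = max_abs_diff(matmul(M_h, M_k), M_h)                   # M_h M_k = M_h
err_comm = max_abs_diff(matmul(M_h, M_k), matmul(M_k, M_h))   # commutativity
print(err_G, err_M, err_comm)
```

All three printed discrepancies should be at rounding level, in agreement with (1).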

The procedure adopted in § 2 then leads to 



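and so the partitioned determinantal form for G(a) becomes

$$G(a) = \begin{vmatrix} I - a\rho\sigma_1\sigma_2M_h & -a\sigma_1^2M_h \\ -a\sigma_2^2M_k & I - a\rho\sigma_1\sigma_2M_k \end{vmatrix}^{-1/2}. \tag{3}$$

This again, since the submatrices are commutative, becomes

$$|\,I - a\rho\sigma_1\sigma_2(M_h + M_k) - a^2(1-\rho^2)\sigma_1^2\sigma_2^2M_h\,|^{-1/2}. \tag{4}$$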

Now in this case $M_h$ and $M_k$ can be simultaneously transformed to diagonal canonical
form by one and the same orthogonal matrix. For let us extend the orthonormal basis until
it includes n functions $q_j(t)$, $j = 0, 1, 2, \ldots, n-1$. Its matrix Q is then a non-singular
orthogonal matrix of order $n \times n$. Evidently we have

$$Q'(Q_hQ_h')Q = \begin{bmatrix} I \\ 0 \end{bmatrix}\begin{bmatrix} I & 0 \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & 0 \end{bmatrix} = \Lambda, \tag{5}$$

where the submatrices denoted by I are of order $h \times h$. This means that the canonical form
of the complementary matrix $M_h = I - G_h$ has its last $n-h$ diagonal elements equal to unity
and the rest null; but for our purpose it does not matter whether the units are in leading
positions or last positions. Transforming therefore the matrix enframed in (4) above by
$Q'(\ldots)Q$, and expanding by diagonal elements (of three different kinds), we obtain G(a) as

$$|\,1 - 2\rho\sigma_1\sigma_2a - (1-\rho^2)\sigma_1^2\sigma_2^2a^2\,|^{-\frac12(n-h)}\;|\,1 - \rho\sigma_1\sigma_2a - (1-\rho^2)\sigma_1^2\sigma_2^2a^2\,|^{-\frac12(h-k)}. \tag{6}$$

The result is not so simple as before. The coefficient of a gives the mean of $\sum u_iv_i$ as
$\tfrac12(2n - h - k)\rho\sigma_1\sigma_2$. The coefficient of $a^2/2!$ gives the mean square, and this, adjusted so as to
refer to the mean as origin, gives the sampling variance

{(* - &)(* + p 2 ) - #(* - (7) 

again less simple than before. 
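As a check on the stated mean, the coefficient of a may be read off from the logarithm of (6). The sketch below assumes only that the two factors of (6) carry the exponents $-\tfrac12(n-h)$ and $-\tfrac12(h-k)$ and have linear terms $-2\rho\sigma_1\sigma_2a$ and $-\rho\sigma_1\sigma_2a$ respectively; the terms in $a^2$ do not affect the mean.

```latex
\begin{align*}
\log G(a) &= -\tfrac12(n-h)\log\{1 - 2\rho\sigma_1\sigma_2\,a + O(a^2)\}
             -\tfrac12(h-k)\log\{1 - \rho\sigma_1\sigma_2\,a + O(a^2)\} \\
          &= \{(n-h) + \tfrac12(h-k)\}\,\rho\sigma_1\sigma_2\,a + O(a^2) \\
          &= \tfrac12(2n-h-k)\,\rho\sigma_1\sigma_2\,a + O(a^2).
\end{align*}
```

Since G(0) = 1, the coefficient of a in G(a) is the same as in log G(a).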

The sampling probability distribution of the estimate of product moment, $\sum u_iv_i/(n - \tfrac12h - \tfrac12k)$,
is also different in this case from that of the estimate $\sum u_iv_i/(n-k)$ derived from § 2. It is
known (cf. Wishart and Bartlett, 1932, 1933) that in the normal bivariate sampling problem 
the estimate of product moment is distributed according to a certain Bessel function of the 
second kind with imaginary argument. For the case of § 2 we have merely to take the Bessel 
function, as given in Wishart and Bartlett, and change $n-1$ into $n-k$. In the same way the
distribution of r, the estimate of p obtained from the mean squares and mean product of § 2 , 
is Fisher’s distribution of r (Fisher, 1915) with n-k degrees of freedom. But these no longer 
hold in the present section; the distribution of estimate of product moment, for example, will 
involve at least a series of Bessel functions, and the distribution of the estimate of p will be 
more complicated than that of Fisher. 

The simultaneous orthogonal transformation of $M_h$ and $M_k$ can be given a further extension.
Let us suppose that x is represented in terms of h functions or columns from Q, and y in terms
of k columns, neither of these sets being necessarily from consecutive columns; and suppose
that these sets have s columns in common. If we denote the matrices of these three sets,
namely the first two and their “intersection”, by $Q_h$, $Q_k$, $Q_s$, it follows easily under simultaneous
orthogonal transformation by the whole matrix Q that $G_hG_k = G_kG_h = G_s$, and that the diagonal
units in the canonical forms of $G_h$, $G_k$ and $G_s$ are in those respective columns that characterize
$Q_h$, $Q_k$ and $Q_s$. The complementary canonical forms of $M_h$ and $M_k$ have a community or
intersection involving $n-h-k+s$ unit diagonal elements, and another of s zero elements, the
remaining $h+k-2s$ units appearing singly among them. We have therefore, for G(a),

$$|\,1 - 2\rho\sigma_1\sigma_2a - (1-\rho^2)\sigma_1^2\sigma_2^2a^2\,|^{-\frac12(n-h-k+s)}\;|\,1 - \rho\sigma_1\sigma_2a - (1-\rho^2)\sigma_1^2\sigma_2^2a^2\,|^{-\frac12(h+k-2s)}, \tag{8}$$

from which the mean of $\sum u_iv_i$, namely $\tfrac12(2n-h-k)\rho\sigma_1\sigma_2$, and the variance

{(* - S)(l +p 2 ) -| (i+k- 2(9) 

are derived. 

The extension considered here is by no means far-fetched. It might easily be the case, for 
example, that x was represented by a set of odd functions, y by a set of even functions, taken 
from one basis Q. 

5. Case of Different Basic Representations 


We have next the case in which the respective bases for x and y belong to different sets of
orthogonal functions. There is then in general no such relation as $MN = NM$, where we use
M and N for the transforming matrices for residuals, nor is there any simultaneous transformation
of M and N to diagonal form. If x is polynomial and y harmonic, for example, we cannot
combine the two bases and orthonormalize them, for that would have the effect of making
both x and y a mixture of polynomial and harmonic functions. However, we reach without
difficulty the stage of § 4 (3). Here the first check occurs; but by linear operations upon
“rows” of submatrices,


$$\text{row}_1 + \rho\,\frac{\sigma_1}{\sigma_2}\,\text{row}_2, \qquad \text{row}_2 - \ldots \tag{1}$$

we obtain

$$G(a) = |\,I - a\rho\sigma_1\sigma_2(M + N) - a^2(1-\rho^2)\sigma_1^2\sigma_2^2MN\,|^{-1/2}, \tag{2}$$

the relation of which to the earlier forms of G(a) is readily perceived. Since we cannot
simultaneously transform M and N into diagonal shape, the most that we can do is to record
the mean value, the coefficient of a in the expansion. This is the trace of $\tfrac12\rho\sigma_1\sigma_2(M+N)$,
namely $\tfrac12(2n-h-k)\rho\sigma_1\sigma_2$, since the separate traces of M and N are $n-h$, $n-k$. For example,
if x is represented by a quadratic polynomial and y by 5 terms of a Fourier series, the mean
value of $\sum u_tv_t$ will be $(n-4)\rho\sigma_1\sigma_2$.

The sampling variance of $\sum u_tv_t$, on the other hand, is in this case difficult to evaluate,
though quite explicit. The coefficient of $a^2/2!$ in the expansion of G(a) involves the diagonal
elements of MN, as well as all principal minors of the second order in M+N, and though
these might be found and summed in any particular case, the process will be laborious if n is
large, for there are $\tfrac12n(n-1)$ such minors. The fact is that there are no convenient general
formulae for the traces of compounds of a matrix sum A+B. Therefore we leave G(a) in
determinantal form.
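The traces quoted in this section, and the failure of commutativity, can be illustrated numerically. The sketch below is an illustration only: it assumes n = 12 equally spaced observation times, a quadratic polynomial basis for x and five Fourier terms (constant, two cosines, two sines) for y; the helper names are hypothetical.

```python
import math

def orthonormalize(cols):
    """Modified Gram-Schmidt on a list of column vectors."""
    out = []
    for v in cols:
        v = list(v)
        for q in out:
            dot = sum(a * b for a, b in zip(v, q))
            v = [a - dot * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        out.append([a / norm for a in v])
    return out

def residual_matrix(cols, n):
    """M = I - QQ', where Q orthonormalizes the given basis columns."""
    Q = orthonormalize(cols)
    return [[(1.0 if i == j else 0.0) - sum(q[i] * q[j] for q in Q)
             for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)] for i in range(n)]

n = 12                                   # assumed number of observations
t = [i / n for i in range(n)]
poly = [[x ** p for x in t] for p in range(3)]            # 1, t, t^2  (h = 3)
fourier = ([[1.0] * n]
           + [[math.cos(2 * math.pi * j * x) for x in t] for j in (1, 2)]
           + [[math.sin(2 * math.pi * j * x) for x in t] for j in (1, 2)])  # k = 5

M = residual_matrix(poly, n)
N = residual_matrix(fourier, n)
trace_M = sum(M[i][i] for i in range(n))     # expected n - 3
trace_N = sum(N[i][i] for i in range(n))     # expected n - 5
half_trace = 0.5 * (trace_M + trace_N)       # expected n - 4

MN, NM = matmul(M, N), matmul(N, M)
comm_gap = max(abs(MN[i][j] - NM[i][j]) for i in range(n) for j in range(n))
print(trace_M, trace_N, half_trace, comm_gap)
```

The half-sum of the traces reproduces the factor n - 4 quoted above, while the non-zero value of `comm_gap` shows that M and N do not commute for these two bases.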


6 . Extension to Multivariate Case 

Finally, as might be anticipated, the result of § 2 (13) admits of extension to the case of
s variates (x, y, z, . . .) observed at a succession of times t. Let us assume that the vector of
s errors $\epsilon_t$ at each observation obeys a multivariate normal law characterized by V. Then,
provided always that the representations of x, y, z, . . . are in terms of the same basis of
orthonormal functions, a joint m.g.f. for all sums of squares and binary products of corresponding
residuals may be constructed by assigning an indeterminate $a_{ij}$ to carry the moments
of each; and one arrives, through the commutativity of all submatrices in a partitioned determinant,
at the following function of the elements of the matrix $A = [a_{ij}]$,

$$G(A) = |\,I - VA\,|^{-\frac12(n-k)}. \tag{1}$$

Corresponding expressions, involving many factors like the above, could be derived for the
case where x, y, z, . . . are represented by different selections, with varying community of
columns, from the same orthonormal basis given by Q. The question is one of the intersections
of complementary sets of given sets; and a combinatory diagrammatic rule can be
found for the several exponents. These wide generalizations are remote from application.
One can do well enough by estimating variances for one variate at a time, and product moments
according to pairs of variates.


REFERENCES TO LITERATURE 

Aitken, A. C., 1931. “Some Applications of Generating Functions to Normal Frequency”, Quart.
Journ. Math., Oxford Series, II, 130-135.

-, 1945. “On Linear Approximation by Least Squares”, Proc. Roy. Soc. Edin., LXII, 138-146, 143.

Fisher, R. A., 1915. Biometrika, X, 507-521.

Fisher, R. A., and Yates, F., 1943. Statistical Tables for Biological, Agricultural and Medical
Research, Oliver & Boyd, Edinburgh, pp. 20-22.

Wishart, J., 1928. Biometrika, XX, 32-52, 44.

Wishart, J., and Bartlett, M. S., 1932. Proc. Camb. Phil. Soc., XXVIII, 455-459, 458.

-, and -, 1933. Proc. Camb. Phil. Soc., XXIX, 260-270, 267.


[Corrigendum. The opportunity is taken of making a correction in a previous
paper, “On the Independence of Linear and Quadratic Forms in Samples of Normally Distributed
Variates”, Proc. Roy. Soc. Edin., LX, 1940, 40-46. On p. 45, the coefficient of the
middle term in the bracket on the left of (3) should be +2, not -2. The rest of the section
is to be deleted except for the sentence, “This is not independent of β”.]


(Issued separately February 25, 1948) 





XXX— On Hill’s Problems with Complex Parameters and a Real Periodic 
Function. By M. J. O. Strutt. Communicated by Sir Edmund Whittaker, 
F.R.S. (With Five Text-figures.) 

(MS. received November 1, 1945. Read May 6, 1946)

Summary 

Hill’s differential equation (1.1) derives its importance from being the prototype of the 
different equations of Lame and of the equation of Mathieu, which are connected with wave 
and potential problems in mathematical physics. Besides this, numerous instances of its 
occurrence in problems of elasticity and of dynamical or statical stability are known. In the 
present treatment, conditions are reversed with respect to most of the older publications, since 
the characteristic multiplier $\sigma$ of equations (1.2) is not sought as a function of the given parameters
$\lambda$ and $\gamma$ of equation (1.1), but $\sigma$ is supposed given and the corresponding values of $\lambda$ and
$\gamma$ are regarded as unknown. Thus a linear homogeneous boundary value problem of the
second order and of non-self-adjoint type ensues, the values of $\sigma$ and of $\lambda$, $\gamma$ being in general
complex. On this latter point the present paper considerably enlarges the scope of some
previous papers published by the author during the war along somewhat similar lines but for 
real characteristic values (Nos. 13-18 of the references at the end). 

In § 1 the problems are stated together with a number of recently found facts pertaining to 
the case of real characteristic values. In § 2 Green’s function, on which the present treatment 
is largely based, is calculated in terms of two linearly independent solutions for general values 
of $\sigma$ by solving eight linear equations with eight unknown constants. From Green’s function
some lower bounds for the characteristic values of smallest modulus are derived. These 
bounds are illustrated by figs. 1-5 in some cases which are of interest from the point of view 
of applied mathematics. The asymptotic solutions of equations (1.1) and (2.2) are dealt with 
in § 3 considering the two cases according as $1 + \phi\Phi(z)$ has zeros or not. In the former case,
the occurrence of Stokes’s phenomenon entails elaborate expressions for the asymptotic solutions
under different conditions, based on publications by R. E. Langer. The equations determining 
the characteristic values are given in both cases, including two theorems on the occurrence of 
real and complex characteristic values in the latter case. The discussion of the equations 
determining the characteristic values is continued in § 4. It is shown that the characteristic 
values corresponding to $|\sigma| = 1$ are spaced at regular intervals in the latter case. By the
examination of eighteen sub-cases the corresponding characteristic values are found to be 
clustered in very narrow intervals, spaced regularly in the former case. An estimate of the 
width of these intervals is given. Attention is drawn to the similarity of this situation to the 
results of several previous investigations, the present result including the latter as particular 
instances. Section 5 is devoted to the application of A. L. Cauchy’s integral theorem in 
obtaining a convergent infinite series expansion of a generalized Green’s function. The 
second case ($1 + \phi\Phi(z)$ having no zeros), mentioned above, is dealt with in the first place, the
said series being derived by application of the corresponding asymptotic solutions. In § 6 the
first case ($1 + \phi\Phi(z)$ having zeros) is examined, leading to twenty-four sub-cases, which may
be reduced to twelve. The fundamental formulae for their discussion are given, and the
discussion itself is recorded in one of these twelve cases. The results obtained in the remaining 
eleven cases are quoted. In order to complete the said series expansion, the residues are 
calculated in § 7. Hereupon the absolute and uniform convergence of the series obtained is 
proved in § 8, for the two cases mentioned, by application of the asymptotic solutions of § 3. 
The series are then applied in § 9 to the expansion of arbitrary functions as well as to the 
solution of Hill’s linear inhomogeneous problems. Finally in § 10, by the iteration of Green’s
functions, expressions are obtained for the characteristic values as limits of certain integral
operations. The characteristic functions are obtained by similar processes, and the procedure
is recorded in the cases of suffixes 1 and 2. The formulae thus obtained are thought to be of
considerable practical importance.


1. Statement of Problem and of some Previous Results 


We shall consider solutions of the differential equation

$$\frac{d^2w}{dz^2} + w(z)\{\lambda + \gamma\Phi(z)\} = 0, \tag{1.1}$$

where z is a real variable, $\lambda$ and $\gamma$ are real or complex parameters, and $\Phi(z)$ is a real function
of z, periodic with a fundamental real period $\zeta$, satisfying Dirichlet’s conditions and not
persistently zero for any finite interval of z. The solutions w are required to satisfy the linear
homogeneous conditions:

$$w(z + \zeta) = \sigma w(z); \qquad w'(z + \zeta) = \sigma w'(z), \tag{1.2}$$

accents denoting differentiation with respect to z. The parameter $\sigma$, indicated as “characteristic
multiplier”, may be real or complex. If $\sigma$ is given a definite value, the parameters $\lambda$ and
$\gamma$ together with the function w being regarded as unknown, we have a linear homogeneous
boundary problem, $\lambda$ and $\gamma$ assuming “characteristic values” for any solution w. The latter
will be assumed continuous in itself as well as regards its first derivative with respect to z, and
not persistently zero throughout an entire period. The problem thus formulated will be called
Hill’s boundary problem, whilst equation (1.1) is Hill’s equation [22].† Mathieu’s equation
and the different forms of Lamé’s equation pertaining to potential as well as to wave-problems
may be considered as particular cases of equation (1.1) under certain conditions [19].† This
boundary problem is not self-adjoint (equation (1.1) being so, but equations (1.2) not). By a
simple transformation the boundary conditions could be made self-adjoint, the resulting
differential equation then being not self-adjoint. Besides Hill’s problem as formulated in
equations (1.1) and (1.2) we shall also consider the adjoint problem, in which w* satisfies
equation (1.1) with w* instead of w but different boundary conditions, adjoint to (1.2):

$$\sigma w^*(z + \zeta) = w^*(z); \qquad \sigma w^{*\prime}(z + \zeta) = w^{*\prime}(z). \tag{1.3}$$

The solutions w and w* of these two adjoint problems, owing to their homogeneous character, 
each still contain an arbitrary multiplier. 

If a pair of parameters $\lambda$, $\gamma$ together with a particular function w(z) satisfy the equations
(1.1) and (1.2), the same pair of parameters, together with a function w*(z) (in general different
from w), satisfy the equations (1.1) and (1.3). Thus it may be stated that Hill’s problem and
its adjoint one have the same characteristic values of $\lambda$ and $\gamma$. The functions w and w*
corresponding to different pairs of parameters indicated by $\lambda_a$, $\gamma_a$ and $\lambda_b$, $\gamma_b$ by Green’s
integral theorem satisfy the bi-orthogonal relation

$$\int_z^{z+\zeta}\{\lambda_a - \lambda_b + (\gamma_a - \gamma_b)\Phi(z)\}\,w(z)\,w^*(z)\,dz = 0. \tag{1.4}$$


If $|\sigma| = 1$ and $\sigma$ is real, the functions w and w* are periodic ($\sigma = +1$) or half-periodic ($\sigma = -1$).
If $|\sigma| = 1$ and $\sigma$ is non-real, the functions w and w* are almost-periodic in the sense of
H. Bohr. If $|\sigma| \neq 1$, the functions w and w* are neither periodic nor almost-periodic but will
be called pseudo-periodic. All three cases have been treated in previous publications, but
most of the published results pertain to the former two cases. It was shown [17, 18] that all
characteristic values of $\gamma$ are then real under the above assumptions, if $\lambda$ is real, and conversely,
and they are then situated on a set of continuous discrete curves in the $\lambda$-, $\gamma$-plane, of infinite
order, $d\lambda/d\gamma$ varying continuously along each curve [15]. Any two separate curves can only
intersect in the finite part of the $\lambda$-, $\gamma$-plane if $\sigma = \pm 1$, and no curve can ever intersect itself
here, $\lambda$ being an integral function of $\gamma$ along each curve if $|\sigma| = 1$. A number of theorems,
determining general and asymptotic (for $\zeta^2|\lambda| + |\gamma| \gg 1$) properties of these curves, has
been derived [15, 16].

† These numbers refer to the list at the end.

In the present general case, however, the characteristic values are not always real. It is
easy to prove that $\lambda$ is an analytic function of $\gamma$ in any case. If $w_{\mathrm{I}}$ and $w_{\mathrm{II}}$ are two independent
solutions of equation (1.1) satisfying the conditions $w_{\mathrm{I}}(z_0) = 1$, $w_{\mathrm{I}}'(z_0) = 0$, $w_{\mathrm{II}}(z_0) = 0$, $w_{\mathrm{II}}'(z_0) = 1$,
and if $z_1 = z_0 + \zeta$, we have by equations (1.1) and (1.2) [19]:

$$\sigma^2 - \sigma\{w_{\mathrm{I}}(z_1) + w_{\mathrm{II}}'(z_1)\} + 1 = 0. \tag{1.5}$$

As $w_{\mathrm{I}}$ and $w_{\mathrm{II}}$ are integral functions of $\lambda$ and $\gamma$, an analytic relation exists, by equation (1.5),
between these parameters for any definite value of $\sigma$.
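Relation (1.5) lends itself to a direct numerical illustration. The sketch below is not part of the paper: it assumes the Mathieu-type choice $\Phi(z) = \cos 2z$ with period $\zeta = \pi$, arbitrary sample parameters, and a plain fourth-order Runge-Kutta integration; all names are hypothetical.

```python
import cmath
import math

def transfer_over_period(lam, gam, zeta=math.pi, steps=2000, z0=0.0):
    """Integrate w'' + w*(lam + gam*cos 2z) = 0 over one period by RK4.

    Returns w_I(z1), w_I'(z1), w_II(z1), w_II'(z1) with z1 = z0 + zeta.
    Phi(z) = cos 2z is an assumed example of the periodic function.
    """
    def accel(z, w):
        return -(lam + gam * math.cos(2.0 * z)) * w

    def integrate(w, dw):
        step = zeta / steps
        z = z0
        for _ in range(steps):
            k1w, k1d = dw, accel(z, w)
            k2w, k2d = dw + 0.5 * step * k1d, accel(z + 0.5 * step, w + 0.5 * step * k1w)
            k3w, k3d = dw + 0.5 * step * k2d, accel(z + 0.5 * step, w + 0.5 * step * k2w)
            k4w, k4d = dw + step * k3d, accel(z + step, w + step * k3w)
            w += step * (k1w + 2 * k2w + 2 * k3w + k4w) / 6.0
            dw += step * (k1d + 2 * k2d + 2 * k3d + k4d) / 6.0
            z += step
        return w, dw

    wI, dwI = integrate(1.0 + 0j, 0j)      # w_I(z0) = 1, w_I'(z0) = 0
    wII, dwII = integrate(0j, 1.0 + 0j)    # w_II(z0) = 0, w_II'(z0) = 1
    return wI, dwI, wII, dwII

lam, gam = 1.5 + 0.3j, 0.8                 # illustrative complex parameters
wI, dwI, wII, dwII = transfer_over_period(lam, gam)

trace = wI + dwII                          # w_I(z1) + w_II'(z1)
disc = cmath.sqrt(trace * trace - 4.0)
sigma_1 = 0.5 * (trace + disc)             # the two roots of (1.5)
sigma_2 = 0.5 * (trace - disc)
wronskian = wI * dwII - dwI * wII          # should stay equal to 1
residual = abs(sigma_1 ** 2 - sigma_1 * trace + 1.0)
print(sigma_1, sigma_2, wronskian, residual)
```

The two roots multiply to unity, so each admissible multiplier $\sigma$ is accompanied by $1/\sigma$, which is the multiplier of the adjoint conditions (1.3).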


2 . The Green’s Function of Hill’s Problem 


In equation (1.1) we may introduce

$$\lambda - \Lambda_0 = \Lambda, \qquad \gamma - \Gamma_0 = \Gamma,$$

and obtain

$$\frac{d^2w}{dz^2} + w(z)\{\Lambda_0 + \Gamma_0\Phi(z)\} + w(z)\{\Lambda + \Gamma\Phi(z)\} = 0. \tag{2.1}$$

If $\Gamma/\Lambda = \phi$ and $\Lambda_0$, $\Gamma_0$, $\phi$ are regarded as fixed, we thus obtain a one-parametric problem:

$$L[w(z)] + \Lambda w(z)\{1 + \phi\Phi(z)\} = 0, \tag{2.2}$$

where L[w] is an abbreviation for the sum of the first two terms of (2.1). In case $\Lambda$ in equation
(2.1) is constantly zero, equation (2.2) has obviously to be replaced by

$$L[w] + \Gamma w\Phi(z) = 0. \tag{2.3}$$

In the case of real parameters $\lambda$, $\gamma$ the parameters $\Lambda$ and/or $\Gamma$ are connected with straight lines
in the $\lambda$-, $\gamma$-plane, including the point $\Lambda_0$, $\Gamma_0$ [15].

We now consider a solution G (Green’s function) of Hill’s problem, satisfying the conditions:
(a) it is a continuous function of two variables z, t; (b) L[G] = 0, except at z = t, with
either z or t as independent variable in the expression L (being a linear differential operator
connected with the problem in hand); (c) G satisfies equations (1.2); (d) G' is discontinuous
at z = t according to the condition:

$$\lim_{\epsilon \to 0}\,\bigl[G'(z, t)\bigr]_{z=t-\epsilon}^{z=t+\epsilon} = -1. \tag{2.4}$$

By

$$G(z, t) = A_1w_{\mathrm{I}}(t) + A_2w_{\mathrm{II}}(t) \quad \text{if } z_0 < z < t,$$

$$G(z, t) = B_1w_{\mathrm{I}}(z) + B_2w_{\mathrm{II}}(z) \quad \text{if } t < z < z_1 = z_0 + \zeta;$$

and

$$A_1 = A_{11}w_{\mathrm{I}}(z) + A_{12}w_{\mathrm{II}}(z), \qquad A_2 = A_{21}w_{\mathrm{I}}(z) + A_{22}w_{\mathrm{II}}(z),$$

$$B_1 = B_{11}w_{\mathrm{I}}(t) + B_{12}w_{\mathrm{II}}(t), \qquad B_2 = B_{21}w_{\mathrm{I}}(t) + B_{22}w_{\mathrm{II}}(t),$$

the functions $w_{\mathrm{I}}$ and $w_{\mathrm{II}}$ being independent solutions of L[w] = 0 satisfying the conditions:
$w_{\mathrm{I}}(z_0) = 1$, $w_{\mathrm{I}}'(z_0) = 0$, $w_{\mathrm{II}}(z_0) = 0$, $w_{\mathrm{II}}'(z_0) = 1$, Green’s function results from the solution of
eight linear equations for the eight constants $A_{11}$, $A_{12}$, etc., obtained by applying the above
conditions to G and varying t. The result is:


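$$G(z,t) = \frac{\{\sigma w_{\mathrm{I}}(z+\zeta) - w_{\mathrm{I}}(z)\}\,w_{\mathrm{II}}(t) - \{\sigma w_{\mathrm{II}}(z+\zeta) - w_{\mathrm{II}}(z)\}\,w_{\mathrm{I}}(t)}{\sigma^2 - \sigma\{w_{\mathrm{I}}(z_1) + w_{\mathrm{II}}'(z_1)\} + 1} \quad \text{if } z \le t,$$

$$G(z,t) = \frac{\sigma\bigl[\{w_{\mathrm{I}}(t+\zeta) - \sigma w_{\mathrm{I}}(t)\}\,w_{\mathrm{II}}(z) - \{w_{\mathrm{II}}(t+\zeta) - \sigma w_{\mathrm{II}}(t)\}\,w_{\mathrm{I}}(z)\bigr]}{\sigma^2 - \sigma\{w_{\mathrm{I}}(z_1) + w_{\mathrm{II}}'(z_1)\} + 1} \quad \text{if } z \ge t. \tag{2.5}$$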

These equations may be easily verified by again applying the said conditions to G. Obviously, 
if the denominator is zero, equations (2.5) are invalid, and we then have to fall back on the 
modified Green’s function [2]. Equation (1.5) corresponds to the latter case. If the conditions 
(1.3) are substituted for (1.2) we obtain the adjoint Green’s function G*(z, t), satisfying the 
relation 

G*(z,t) = G(t,z). (2.6) 

The generally non-self-adjoint character of Hill’s problem, if $\sigma$ does not coincide with some
special values, expresses itself by the asymmetry of G(z, t) with respect to z and t.
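Conditions (a), (c) and (d) on Green’s function can be tested numerically in the simplest setting. The sketch below is an illustration under stated assumptions, not the paper’s computation: it takes $L[w] = w'' + k^2w$ with constant k, so that $w_{\mathrm{I}} = \cos k(z - z_0)$ and $w_{\mathrm{II}} = \sin k(z - z_0)/k$, and it builds G as the two-branch quotient with denominator $\sigma^2 - \sigma\{w_{\mathrm{I}}(z_1) + w_{\mathrm{II}}'(z_1)\} + 1$ that these conditions determine. The values of k, $\zeta$, $\sigma$ and t are arbitrary.

```python
import cmath

k = 1.3                      # assumed constant coefficient: L[w] = w'' + k^2 w
z0, zeta = 0.0, 2.0          # assumed interval start and period
sigma = 0.7 + 0.4j           # assumed complex characteristic multiplier
z1 = z0 + zeta

def w1(z):  return cmath.cos(k * (z - z0))
def w2(z):  return cmath.sin(k * (z - z0)) / k
def dw1(z): return -k * cmath.sin(k * (z - z0))
def dw2(z): return cmath.cos(k * (z - z0))

denom = sigma ** 2 - sigma * (w1(z1) + dw2(z1)) + 1.0

def green(z, t, dz=False):
    """G(z, t), or its z-derivative when dz=True, on z0 <= z <= z1."""
    a, b = (dw1, dw2) if dz else (w1, w2)      # z-dependence of w_I, w_II
    if z <= t:
        num = ((sigma * a(z + zeta) - a(z)) * w2(t)
               - (sigma * b(z + zeta) - b(z)) * w1(t))
    else:
        num = sigma * ((w1(t + zeta) - sigma * w1(t)) * b(z)
                       - (w2(t + zeta) - sigma * w2(t)) * a(z))
    return num / denom

t, eps = 0.8, 1e-9
continuity_gap = abs(green(t + eps, t) - green(t - eps, t))        # condition (a)
jump = green(t + eps, t, dz=True) - green(t - eps, t, dz=True)     # condition (d)
bc_gap = abs(green(z1, t) - sigma * green(z0, t))                  # condition (c)
bc_dgap = abs(green(z1, t, dz=True) - sigma * green(z0, t, dz=True))
print(continuity_gap, jump, bc_gap, bc_dgap)
```

The jump in the derivative comes out as -1 and the boundary discrepancies vanish to rounding, as long as the denominator is not zero.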

With the aid of Green’s function several bounds for characteristic values may be obtained. 
Considering the problem (1.1), (1.2) we have: 

< T + f f <X> 2 (,) | Q\z, t) | dzdt> ( 2 . 7 ) 

I 7i 1 Jz JS 

y 1 being the (or a) characteristic value of smallest modulus, and the operator L being 

d 2 w 

Similarly, in the case (2.2), (1.2) we obtain: 

T^i s j‘ +f £ +f | {1+ waygk*, t) 1 dzdt, (2.8) 

A x again being the (or a) characteristic value of smallest modulus, and 

d 2 w 

lm s 1? +a '^ A ° + 


As an application of equation (2.7), let the maximum value of <£> 2 (s) be <E> 0 2 . Then 

= $o 2 F; F =j] +? f +? l *> I dzdt - ( 2 - 9 > 

We obtain, inserting wj » cos {VA (z - 2 0 )}, w n — sin {V A (z - # 0 )} in equation (2.5): 


\ 

F F 2 

F x - (c^ 2 4- a 2 2 ){(p + 1) 2 4- 4<j 2 p sin % sh a 2 - 4a 1 (p 4 - 2) cos a x ch a 2 

+ 2{p 4 - i)(ch 2a 2 4 - cos 2a x ) 4 - 2(cr 1 2 -&%) 4 -1}, 


4 a 2 


4 «i 


202 


2<X 1 


shaQ sin a, 

-crrfp 4- 2)1 cos a x --ch 02 ) 4- 

( 2 ttg 2 u x 

f „ ch 02 cos Or, \ 

• 2 M^ + ^)^ sha * 


<r=cr 1 + 2<r 2 , 


t= +V-1, 


InWmf or 1 y^o 15 


tVx = cq + z'a 2) 
l 2 


-V IFI 


Vf 2 


(2.10) 


In some cases, pertaining to c = ± i, <r = ± *, o=5,0= 5 + 5/, the expression £ 2 /4 tt 2 | VF | has 
been calculated numerically and is shown in figs. 1-5, in which A = A, + fA*. These results may 
be directly applied to the equations of Mathieu, Lamé, and to related equations, which occur
in connection with problems of elastic stability. 

Other and closer bounds for the characteristic parameter values of smallest moduli will be 
given in the course of this paper (§ 10). 





3. Asymptotic Solutions and Characteristic Values 

The asymptotic solutions of Hill’s problem are fundamental for our further argument.
Equation (2.2) will be used in obtaining these asymptotic solutions, the related equation (2.3)
being only a special case. If $|\sigma| = 1$ and $\phi$ is real, all the characteristic values $\Lambda$ are real,
but if $|\sigma| \neq 1$ this is not always true, as will be shown presently. Two different cases will
be considered: (a) $1 + \phi\Phi(z)$ has no zeros; (b) $1 + \phi\Phi(z)$ has a (finite) number of zeros within each
period $\zeta$. In the case (a) $\phi$ may be real or complex, but in the case (b) it must be real as $\Phi(z)$
is real.

We turn our attention to case (a). Fractional powers of complex quantities will, by
convention, be made determinate as follows:

$$q^a = \{|q|\exp(i\arg q)\}^a = \bigl|\,|q|^a\,\bigr|\exp(ia\arg q).$$

The argument of real quantities is taken as zero.
Then, assuming

$$\bar{w}(\bar{z}) = \{1 + \phi\Phi(z)\}^{1/4}\,w(z), \qquad \bar{z} = \int_{z_0}^{z}\{1 + \phi\Phi(z)\}^{1/2}\,dz,$$

equation (2.2) becomes

$$\frac{d^2\bar{w}(\bar{z})}{d\bar{z}^2} + \Psi_0(\bar{z})\,\bar{w}(\bar{z}) + \Lambda\bar{w}(\bar{z}) = 0, \tag{3.1}$$

where the function $\Psi_0$ may be calculated from $\Phi$ and is, by our assumption, bounded in
modulus, $z_0$ being a constant of integration. Asymptotically, if $|\Lambda\zeta^2| \to \infty$, the solutions
of (3.1) are:

Yl 


w(z) ^ A exp (± izv An 1 + 0| 


jzVA)|i 


WA j /j’ 


(3-2) 


A denoting an arbitrary multiplier and the sign O having the usual sense. These solutions
are valid for values of $\bar{z}$ and $\Lambda$, for which the imaginary part of $\bar{z}\sqrt{\Lambda}$ is either invariably
below or invariably above a constant. Combining equations (3.2) and (1.2) we obtain:


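$$\exp\Bigl\{\pm i\sqrt{\Lambda}\int_z^{z+\zeta}\sqrt{1 + \phi\Phi(z)}\,dz\Bigr\} = \sigma = \exp(\mu\zeta).$$

Hence

$$i\sqrt{\Lambda_m} \approx \frac{\mu\zeta \pm 2m\pi i}{\int_z^{z+\zeta}\{1 + \phi\Phi(z)\}^{1/2}\,dz}, \qquad m = 0, 1, 2, \ldots \tag{3.3}$$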


The value $\mu$ is sometimes called the characteristic exponent of Hill’s problem. Equation (3.3)
determines the asymptotic characteristic values of $\Lambda$ and settles the question of their being real
or complex. Thus, if $|\sigma| = 1$, $\mu$ is purely imaginary, and hence the characteristic values are
asymptotically real and positive if $\phi$ is real and $1 + \phi\Phi(z) > 0$, and are real and negative if
$1 + \phi\Phi(z) < 0$. Two theorems result from equation (3.3):

Theorem 2.1. If $|\sigma| \neq 1$ in equations (1.2), and $1 + \phi\Phi(z)$ in equation (2.2) has no zeros,
the number of real characteristic values of $\Lambda$ is finite, the number of complex characteristic
values being enumerably infinite.

Theorem 2.2. If $|\sigma| = 1$ in equations (1.2), and $1 + \phi\Phi(z)$ in equation (2.2) has no zeros,
the number of complex characteristic values of $\Lambda$ is finite, the number of real characteristic
values being enumerably infinite.
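The content of the two theorems can be seen in the simplest instance of case (a). The sketch below is a hedged illustration only: it assumes the degenerate situation $1 + \phi\Phi(z) \equiv 1$ with $L[w] = w''$, so that $w = \exp(\pm i\sqrt{\Lambda}\,z)$ exactly and the condition $\exp(\pm i\sqrt{\Lambda}\,\zeta) = \sigma = \exp(\mu\zeta)$ gives $i\sqrt{\Lambda_m}\,\zeta = \mu\zeta \pm 2m\pi i$. The function name and sample values of $\sigma$ are hypothetical.

```python
import cmath
import math

def characteristic_values(sigma, zeta, m_max):
    """Lambda_m from i*sqrt(Lambda_m)*zeta = mu*zeta +/- 2*m*pi*i, sigma = exp(mu*zeta).

    Assumes 1 + phi*Phi(z) = 1 identically and L[w] = w'' (illustrative case).
    """
    mu_zeta = cmath.log(sigma)
    values = []
    for m in range(m_max + 1):
        signs = (1,) if m == 0 else (1, -1)
        for sign in signs:
            root = (mu_zeta + sign * 2j * math.pi * m) / (1j * zeta)   # sqrt(Lambda_m)
            values.append(root * root)
    return values

zeta = math.pi
on_circle = characteristic_values(cmath.exp(0.7j), zeta, 5)    # |sigma| = 1
off_circle = characteristic_values(2.0, zeta, 5)               # |sigma| != 1

real_on = sum(1 for v in on_circle if abs(v.imag) < 1e-9)
real_off = sum(1 for v in off_circle if abs(v.imag) < 1e-9)
print(real_on, len(on_circle), real_off, len(off_circle))
```

For the multiplier on the unit circle every computed value is real and non-negative, whereas for $\sigma = 2$ only the value with m = 0 is real and all the others are complex.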

Turning our attention to the case (b) ($1 + \phi\Phi(z)$ having zeros), the occurrence of Stokes’s
phenomenon complicates the discussion considerably. In order to keep it within reasonable
bounds, some further assumptions will be introduced. The function $1 + \phi\Phi(z)$ of equation
(2.2) (or $\Phi(z)$ of equation (2.3)) is supposed to have two simple zeros within each period, situated
at $a_1$ and $a_2$ if the said period is taken from $z_0$ to $z_1$, and $z_0 < a_1 < a_2 < z_1$. Homologous
adjacent zeros outside this interval are: $a_0 < z_0$ and $a_5 > a_4 > a_3 > z_1 = z_0 + \zeta$. It will further
be assumed that $1 + \phi\Phi(z)$ is negative if $a_0 < z < a_1$ (interval I), positive if $a_1 < z < a_2$ (interval
II), negative if $a_2 < z < a_3$ (interval III), positive if $a_3 < z < a_4$ (interval IV), and negative if
$a_4 < z < a_5$ (interval V). The zeros $a_0$, $a_1$, $a_2$, $a_3$, $a_4$, $a_5$ are excluded from the said intervals.
The asymptotic solutions, as applied in this and in subsequent paragraphs, are simple deductions
of formulae, given by R. E. Langer [8, 9, 10]. Only a few principal steps of this derivation
will be quoted. The remainder terms, approaching zero if $|\Lambda\zeta^2| \to \infty$, will be omitted for
sake of brevity, as they may be found in the references given. The notation $\rho = (\Lambda)^{1/2}$ is introduced
in the sense mentioned. The general asymptotic solutions in the intervals I to V are:

Intervals I and II: 

W ~ + Ci^i2^ f + C 2 ^ 2 1 ^ + C 2 C 22 €^ iS ). 

Intervals II and III: 

w ~ + D 2 c 2 l e u + D 2 c 22 *”* f )- 

Intervals III and IV: 

w ~ x F|"^(E 1 ^ 11 ^ f + E^er# + E 2< r 2l ^ f + E 2 ^22^). 

Intervals IV and V: 

w ~ x F£-*(F 3 / 11 ^ + + F \c 2 x e® + F 2 ^). 

In these formulae, $C_1$, $C_2$, $D_1$, $D_2$, $E_1$, $E_2$, $F_1$, $F_2$ are arbitrary constants to be determined by
boundary conditions, imposed on the solutions. The values of $\Psi$, $\xi$, $c_{11}$, $c_{12}$, $c_{21}$, $c_{22}$ are fixed
in the two consecutive intervals corresponding to each asymptotic solution. Thus we have:

Interval I: 

If I X+^ME> [ i pa, 

Y = .- , f = -ip\ | i + </><& I %dz, constants C, 

| 1 + </><£> | * Jz T 

Ziri 8 tri lSrri ni 

c n = eli > £ is = « 12 , ^21 = « 12 , £ 22 = e 12 , ifo< arg p g it. 







Interval IV: 



if (i+<^><i))teV 
\ip— ly «3 J 

(1+<£$)* * 

£=pf (1 +<j)(!>)$dz, 

constants E, 

T - (.+#*)»■ f-cj.o+#•)**. 

c n , ^i2? ^2i) ^22 being as in the interval II. 

constants F, 

Interval V: 



/f j 1 | %dz\> 

UP _ U J 

£ = - ip f | 1+ (j)® | $dz 9 

J a x 

constants F, 

| 1 + <£<£> | i 


c n> ^12? ^21? ^22 being as in the interval I. 


The value of $\Psi$ is real in any interval, $\xi$ being either real or purely imaginary in consecutive
intervals.

From these asymptotic solutions of Hill’s equation under the present conditions the
characteristic values of $\Lambda$ pertaining to Hill’s problem (2.2), (1.2) may be determined by
means of equation (1.5). We have to determine the solutions $w_{\mathrm{I}}$ and $w_{\mathrm{II}}$ occurring in this
equation, i.e. to determine the constants C, D, and E from the boundary conditions imposed
on these solutions. We shall not put the somewhat cumbersome steps, leading to the final
formulae for $w_{\mathrm{I}}(z_1)$ and $w_{\mathrm{II}}'(z_1)$ (see equation (1.5)), on record and only state these results:

<37 

2 Wx(Z'i) = e* p @ l+ pP* -f g~fyPi + pP* — ie “ *pP x “ 2 p&*pAi -j- ze “ * p P l + 2p & ~ p P* + e~* p P x ~ p P* j if o < arg p ^ ~, 

77 

2 Wx(Zj) ss* c " + PA — i 0 “ ipPi “ %PP s + P &2 “ P^a _)_ “ «*P/?i + %pPa ~ pP a ®’pA “ P& ? if — < arg p ~ 7 T, 

2 W ti( z i) 5=5 £ >lp ^ x+p ^* -)-/?“ % *p&+ p^a 4- ie ~ lp ^ x ~ 2p & + p ^ 2 — “ p ^ 2+2 p^ 3 -j- ^ ~ x ’p& “ p& ? if o < arg p = 

2 ^n(^i) = £~ lp ^ x+p ^* ze-i p Pi+pP*-%pPa +sp/?,,-p$, ^g-ipPi-pP*, if — < arg p'S.jr. 

2 

In these equations the abbreviations 

j8 x = [ | 1 + <£<£(£) | jS 2 « f (1 + <f>Q>(z))%dz, j8 3 = f | 1 +</><t>(£) | ^ (3.6) 

J Aj J A) •* <20 

have been used. From equations (3.5) it is simple to obtain the expression:

$$w_{\mathrm{I}}(z_1) + w_{\mathrm{II}}'(z_1) = W = 2\,\mathrm{ch}\,\mu\zeta$$

determining the characteristic values of $\Lambda$. Our results are:

77* 

W ^ ^pPi+pP* + e -ipPi+pPt e -ipPi-pPa, if 0 < arg p il —, 

7T 

*’ p & +p/?8 + <? 4p ^ x ” p ^ 2 + 0” ip ^~if — < argp < 7r. 

2 


( 3 - 7 ) 


( 3 - 8 ) 



4. Discussion of Characteristic Values Derived from 
Equations (3.3), ( 3 . 7 ) and (3.8) 

The discussion of these three equations will reveal some interesting facts on the asymptotic 
situation of complex characteristic values, similar to those proved previously in the case of real 





parameters. We shall first consider equation (3.3) and the preceding equation. Using the 
notation 

__ 

p 2 = A, p = pi + ip 2 , and V1 + <f>$>{z)dz ~p t + ip 2 , 


we obtain: 


Hence, if | cr | =* 1, 


exp {± i(pxpi — P2P2) =F (p 1 i> 2 +p 2 /i)}=cr. 

— exp (±2 |pi> |)=a. 

p2 P 2 


( 4 -i) 


These characteristic values thus correspond to a fixed argument of $\rho$ and hence of $\Lambda$. If $|\rho|$
increases, the characteristic values corresponding to a particular value of $\sigma$ ($|\sigma| = 1$) are
situated at regular intervals. The characteristic values corresponding to $\sigma = +1$ as well as to
$\sigma = -1$ are each double, and $|\rho p| = n\pi$, $n = 0, \pm 1, \pm 2, \pm 3, \ldots$ in this case. This
situation is similar to the case when $1 + \phi\Phi(z)$ has no zeros and $\phi$ is real. Two consecutive
completely periodic characteristic values as well as two consecutive half-periodic characteristic
values coincide asymptotically.

If $\Phi$ has two simple zeros within each period $\zeta$, this will also be true for $\lambda + \gamma\Phi$ and for
$1 + \phi\Phi$, if $|\gamma| \gg |\lambda|$. We are thus automatically led to the second case, considered in § 3.
Hence the situation of the characteristic values may asymptotically be obtained from equations
(3.7) and (3.8) under these conditions. It will be assumed at first that $0 < \arg\rho \le \pi/2$. Let
$\rho = \rho_1 + i\rho_2$, then $\rho_1$ and $\rho_2$ are both positive. Hence:


2 ch pit, = W Pifii+PsPig* *G>i/f*+PiA) -j- ~ Pi fit +p» A») f (4*2) 


If $\beta_1$ is comparable in magnitude to $\beta_2$, we may consider three sub-cases: (a) $\rho_1 \ll \rho_2$,
(b) $\rho_2 \ll \rho_1$, and (c) $\rho_1$ comparable to $\rho_2$. In the sub-case (a) ($\rho_1 \ll \rho_2$), the first term of
W is negligible in comparison to the other two terms. Hence:

$$2\,\mathrm{ch}\,\mu\zeta = W \approx 2e^{\rho_2\beta_1(1+\epsilon)}\cos\{\rho_2\beta_2(1+\delta)\}, \qquad |\epsilon| \ll 1, \quad |\delta| \ll 1. \tag{4.2a}$$

The characteristic values $\rho^2$ pertaining to characteristic exponents $\mu$, such that $\mathrm{ch}\,\mu\zeta$ is comparable
with unity (i.e. $|\sigma|$ approximately or exactly unity) are, by the condition $\rho_2\beta_1 \gg 1$,
approximately determined by $\rho_2\beta_2(1+\delta)$ being near to an odd multiple of $\pi/2$. Considering
one particular multiple of $\pi/2$, all the characteristic values corresponding to varying values
of $\sigma$ are very close together, and the more so the larger $\rho_2\beta_1$ is. Asymptotically, if $\rho_2\beta_1 \to \infty$,
they approach the characteristic values connected with W = 0 and pertaining to $\sigma = \pm i$,
the period of the corresponding solutions being $4\zeta$, i.e. four times the fundamental period
(as $i^4 = +1$). The distance $\Delta\rho_2$ between the two values of $\rho_2$ corresponding to adjacent
characteristic values for $\sigma = \pm 1$ or $\mathrm{ch}\,\mu\zeta = \pm 1$ is given by:

$$\beta_2\,\Delta\rho_2 \approx 2e^{-\rho_2\beta_1},$$

which is very small compared with unity. This is also true for $|\zeta\Delta\rho|$.
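The exponentially small spacing can be confirmed by solving the leading-order relation directly. The following sketch is an illustration only: it neglects the corrections $\epsilon$ and $\delta$, takes the assumed values $\beta_1 = \beta_2 = 1$, and looks near one odd multiple of $\pi/2$ for the two roots of $e^{\rho_2\beta_1}\cos(\rho_2\beta_2) = \pm 1$, which correspond to $\mathrm{ch}\,\mu\zeta = \pm 1$.

```python
import math

beta_1, beta_2 = 1.0, 1.0                  # assumed illustrative values
N = 4                                      # which odd multiple of pi/2 is examined
centre = (0.5 * math.pi + N * math.pi) / beta_2

def half_W(rho_2):
    """W/2 from the leading-order form of (4.2a), with epsilon = delta = 0."""
    return math.exp(rho_2 * beta_1) * math.cos(rho_2 * beta_2)

def bisect(func, lo, hi, iterations=200):
    """Plain bisection; assumes func changes sign exactly once on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

root_plus = bisect(lambda r: half_W(r) - 1.0, centre - 0.1, centre)    # ch = +1
root_minus = bisect(lambda r: half_W(r) + 1.0, centre, centre + 0.1)   # ch = -1

spacing = abs(root_minus - root_plus)
predicted = 2.0 * math.exp(-centre * beta_1) / beta_2
ratio = spacing / predicted
print(spacing, predicted, ratio)
```

The ratio of the computed spacing to $2e^{-\rho_2\beta_1}/\beta_2$ is very close to unity, and increasing N shrinks the spacing exponentially.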

In the sub-case (b) the second term of W in equation (4.2) is approximately negligible, and 
we obtain: 

2 ch [xl =W ^ cos {/OxftCr +§)}, | e | « x, 18 | « x. (4.2 b) 

Hence similar conclusions may be drawn as in sub-case (a). 

In the sub-case (c) a new argument arises from the first two terms of $W$ in equation (4.2) being negligible compared with the third one. In this case no characteristic values corresponding to $|\sigma|$ exactly or approximately equal to unity occur at all.

As a second case, we may assume $\beta_1 \ll \beta_2$ and again obtain three sub-cases as above: (a) $\rho_1 \ll \rho_2$, (b) $\rho_1 \gg \rho_2$, and (c) $\rho_1$ comparable with $\rho_2$. As a third case, we have $\beta_1 \gg \beta_2$, again with the same three sub-cases. The results of discussing these cases and sub-cases are similar to those obtained in the first case mentioned.

An exactly similar discussion has been applied to $\pi/2 < \arg\rho \le \pi$, corresponding to the second equation (3.8). The results are again similar to those of the previous cases. As a result of these discussions (eighteen sub-cases in all) we may formulate the

Theorem 4.1. The characteristic values, resulting from equations (3.7) and (3.8) corresponding to $|\sigma|$ nearly or exactly equal to unity, are clustered in intervals $|\zeta\,\Delta\rho|$ of very small width compared with unity, the width decreasing exponentially with increasing $|\rho|$.

From the above discussion the fact may be derived that the moduli of the characteristic values $\lambda$ corresponding to any particular value of $\sigma$ asymptotically increase proportionally to the squares of integers, both in the case that $1+\phi\Phi$ has zeros and in the case that it has none. This is an important item in the proof of the uniform convergence of some series expansions in § 8.
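The growth law can be seen explicitly in the simplest special case. The sketch below is an illustration only and not the general argument of the text: it assumes $\Phi \equiv 0$, so that the equation reduces to $w'' + \lambda w = 0$, and it assumes the boundary condition takes the form $w(z+\zeta) = \sigma w(z)$. The values of `sigma` and `zeta` are hypothetical.

```python
import cmath
import math

def characteristic_values(sigma, zeta, m_max):
    """Characteristic values of w'' + lambda*w = 0 with w(z + zeta) = sigma*w(z).

    The solutions exp(i*rho*z) obey the boundary condition when
    exp(i*rho*zeta) = sigma, so rho_m = (theta + 2*pi*m)/zeta with
    sigma = exp(i*theta), and lambda_m = rho_m**2.
    """
    theta = -1j * cmath.log(sigma)  # complex "phase" of the multiplier
    return {m: ((theta + 2 * math.pi * m) / zeta) ** 2
            for m in range(-m_max, m_max + 1)}

sigma = 0.5 + 1.5j   # hypothetical complex multiplier
zeta = 2.0           # hypothetical period
lam = characteristic_values(sigma, zeta, 200)
limit = (2 * math.pi / zeta) ** 2

for m in (1, 10, 100, 200):
    # |lambda_m| / m^2 should settle towards (2*pi/zeta)^2
    print(m, abs(lam[m]) / m**2, limit)
```

In this model case the ratio $|\lambda_m|/m^2$ tends to $(2\pi/\zeta)^2$ whatever the finite complex value of $\sigma$, which is the behaviour the paragraph above asserts in general.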

These interesting facts about the asymptotic situation of characteristic values have been 
found for the Mathieu and Lamé equations with real parameters by E. L. Ince [4, 5, 6, 7, 3],
and empirically for the Mathieu equation with one purely imaginary parameter by H. P. 
Mulholland and S. Goldstein [11]. The author has found them for E. Meissner’s equation 
[19] with real and with complex parameters (the latter case still unpublished), and has proved 
them under general conditions for Hill’s equation with real parameters [15, 16]. Now this 
property has been shown to be valid for Hill’s equation with complex parameters under 
specified conditions. Thus this sequence may perhaps be said to have met its completion, 
comprising the particular cases considered hitherto. 


5. Series Expansions in Terms of Characteristic Functions 

The series expansion as considered here is based on A. L. Cauchy's integral theorem and has been applied by H. Poincaré [12]. A more general application was given by J. Tamarkine [21]. The case, in which $1+\phi\Phi(z)$ has zeros, is, however, probably beyond the existing applications.

We consider a function $K(z, t)$ of two variables satisfying the conditions: (a) $K$ satisfies equation (2.2) with respect to $z$ and to $t$ as independent variables except at $z = t$; (b) $K$ is a continuous function of both variables; (c) $K$ satisfies equation (1.2); (d) $K'$ is discontinuous at $z = t$ according to:

$$\lim_{\epsilon\to 0}\Bigl[K'(z,t)\Bigr]_{z=t-\epsilon}^{z=t+\epsilon} = -1.$$

Hence $K$ is similar to the Green's function $G$ of § 2, the differential equation being, however, slightly extended. This function $K$ is a meromorphic function of $\lambda$, having, as we shall see below, generally simple poles at the characteristic values of this parameter. We shall consider a simple closed curve in the complex $\lambda$-plane, not touching any characteristic value of $\lambda$. Let these characteristic values be arranged according to the order of their moduli and be thus numbered. Values with equal moduli may be numbered according to increasing arguments. A particular curve of the kind considered surrounds the characteristic values of suffix up to $n$ (any integral number) and is therefore indicated as $C_n$. According to A. L. Cauchy's theorem we have:

$$\frac{1}{2\pi i}\oint_{C_n}\frac{K(z,t,\kappa)}{\kappa-\lambda}\,d\kappa = K(z,t,\lambda) + \sum_{m=1}^{n}\frac{R_m(z,t)}{\lambda_m-\lambda}. \qquad (5.1)$$


Here the contour-integral is taken along the entire curve $C_n$ in the complex $\kappa$-plane, whilst the value $\kappa$ is inserted for $\lambda$ in the function $K(z,t,\kappa)$ under the integral sign. The expression $K(z,t,\lambda)$ represents the residue of the integrand at $\kappa=\lambda$, and $R_m(z,t)$ is the residue of $K(z,t,\kappa)$ at $\kappa=\lambda_m$, a characteristic value of $\lambda$ inside $C_n$, so that the integrand has the residue $R_m(z,t)/(\lambda_m-\lambda)$ there. The value $\lambda$ is also inside this contour. From equation (5.1) a series-expansion of $K(z,t,\lambda)$ may be derived:

$$K(z,t,\lambda) = -\sum_{m=1}^{n}\frac{R_m(z,t)}{\lambda_m-\lambda} + \frac{1}{2\pi i}\oint_{C_n}\frac{K(z,t,\kappa)}{\kappa-\lambda}\,d\kappa. \qquad (5.2)$$


We shall now expand the said closed curve so as to enclose more and more characteristic values $\lambda_m$. The shapes of the curves of this sequence may be chosen so as to yield simple analytical results. It will be shown that the contour-integral approaches zero if $n \to \infty$. Thus a convergent infinite series expansion for $K(z,t)$ is obtained if the residues $R_m(z,t)$ are known. The function $K(z,t)$ may evidently be represented by equations exactly similar to (2.5), $w_I$ and $w_{II}$ being solutions of equation (2.2) with $\kappa$ instead of $\lambda$, satisfying the conditions at $z = z_0$ that were stated previously. From equations (2.5) we may conclude that the numerator as well as the denominator of $K(z,t,\kappa)$ are integral transcendental functions of $\kappa$, $w_I$ and $w_{II}$ being such functions by a well-known theorem of H. Poincaré. In the finite part of the complex $\kappa$-plane the function $K$ hence has (generally simple) poles at $\kappa = \lambda_m$ corresponding to the zeros of the denominator.

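From equation (3.2) we obtain asymptotically, if $|\kappa\zeta^2| \to \infty$, subject to the condition that the imaginary part of $\bar z\sqrt{\kappa}$ remains either above or below a fixed value if $z$ is varied:

$$w_I(\bar z) \sim \left\{\frac{1+\phi\Phi(z_0)}{1+\phi\Phi(z)}\right\}^{1/4}\cos(\bar z\sqrt{\kappa}), \qquad w_{II}(\bar z) \sim \frac{\sin(\bar z\sqrt{\kappa})}{\sqrt{\kappa}\,\{1+\phi\Phi(z)\}^{1/4}\{1+\phi\Phi(z_0)\}^{1/4}}, \qquad (5.3)$$

remainder terms approaching zero asymptotically having been omitted. Furthermore,

$$\begin{aligned}
w_I(\bar z+\bar z_1) &\sim \left\{\frac{1+\phi\Phi(z_0)}{1+\phi\Phi(z)}\right\}^{1/4}\cos\{\sqrt{\kappa}(\bar z+\bar z_1)\},\\
w_{II}(\bar z+\bar z_1) &\sim \frac{\sin\{\sqrt{\kappa}(\bar z+\bar z_1)\}}{\{1+\phi\Phi(z)\}^{1/4}\{1+\phi\Phi(z_0)\}^{1/4}\sqrt{\kappa}},\\
w_I(\bar z_1) &\sim w'_{II}(\bar z_1) \sim \cos(\sqrt{\kappa}\,\bar z_1),\\
\bar z_1 &= \int_{z_0}^{z_0+\zeta}\{1+\phi\Phi(z)\}^{1/2}\,dz.
\end{aligned} \qquad (5.4)$$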


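If $z$ is replaced by $t$, the variable $\bar z$ will be assumed to change into $\bar t$. By this remark and (5.3), (5.4), equations (2.5) yield in the case $|\kappa\zeta^2| \to \infty$ the equations:

$$K(z,t,\kappa) \sim \frac{\sin\{\sqrt{\kappa}(\bar z-\bar t)\} - \sigma\sin\{\sqrt{\kappa}(\bar z_1+\bar z-\bar t)\}}{\sqrt{\kappa}\,\{\sigma^2-2\sigma\cos(\sqrt{\kappa}\,\bar z_1)+1\}\{1+\phi\Phi(z)\}^{1/4}\{1+\phi\Phi(t)\}^{1/4}}$$

if $z \le t$ and hence $|\bar z| \le |\bar t|$;

$$K(z,t,\kappa) \sim \sigma\,\frac{\sigma\sin\{\sqrt{\kappa}(\bar t-\bar z)\} - \sin\{\sqrt{\kappa}(\bar z_1-\bar z+\bar t)\}}{\sqrt{\kappa}\,\{\sigma^2-2\sigma\cos(\sqrt{\kappa}\,\bar z_1)+1\}\{1+\phi\Phi(z)\}^{1/4}\{1+\phi\Phi(t)\}^{1/4}}$$

if $z \ge t$ and hence $|\bar z| \ge |\bar t|$. $\qquad (5.5)$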


If $\kappa$ varies along a contour in the complex $\kappa$-plane, not touching any characteristic value $\kappa = \lambda_m$, the expression $|\sqrt{\kappa}\,K|$ remains below a fixed finite positive bound $M$, if $|\kappa\zeta^2| \to \infty$. This follows from $|\bar t - \bar z| \le |\bar z_1|$ and $|\bar z_1 + \bar z - \bar t| \le |\bar z_1|$ if $|\bar z| \le |\bar t|$, and from $|\bar t - \bar z| \le |\bar z_1|$ and $|\bar z_1 + \bar t - \bar z| \le |\bar z_1|$ if $|\bar z| \ge |\bar t|$, and from the properties of $\Phi(z)$, if $z$ and $t$ are real. The exponential functions with positive real part of the argument are always smaller in the numerator of $K$ than in the denominator. For real values of $\kappa \ne \lambda_m$ the expression $|\sqrt{\kappa}\,K|$ remains also below a suitable fixed positive finite bound $M$, as may be concluded from (5.5). Hence it is obvious that the contour-integral of (5.2), assuming the contours, e.g., to be circles of radius $|\kappa|$, approaches zero if $|\kappa\zeta^2| \to \infty$. Varying $\kappa$ along the said contour, we have $0 < \arg\sqrt{\kappa} \le \pi$, and this complies exactly with the condition imposed on the asymptotic solutions (3.2), (5.3) and (5.4).
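The boundedness of $|\sqrt{\kappa}\,K|$ on expanding contours can be checked numerically in a model case. The sketch below is an illustration under stated assumptions, not the proof: it takes $\Phi \equiv 0$ (so that $\bar z = z$, $\bar z_1 = \zeta$ and all factors $\{1+\phi\Phi\}^{1/4}$ equal unity), writes $\rho = \sqrt{\kappa}$, and evaluates the first expression (5.5) for $z \le t$. The multiplier `SIGMA`, the period `ZETA` and the choice of radii are hypothetical.

```python
import cmath
import math

SIGMA = 2.0  # hypothetical multiplier, chosen real with |sigma| != 1
ZETA = 1.0   # hypothetical period

def rho_times_kernel(rho, d):
    """rho*K for z <= t with t - z = d, from the first expression (5.5) with Phi = 0."""
    numerator = cmath.sin(-rho * d) - SIGMA * cmath.sin(rho * (ZETA - d))
    denominator = SIGMA**2 - 2 * SIGMA * cmath.cos(rho * ZETA) + 1
    return numerator / denominator

def max_modulus_on_semicircle(radius, samples=2000):
    """Largest |rho*K| for |rho| = radius, 0 <= arg rho <= pi, 0 <= t - z <= ZETA."""
    worst = 0.0
    for k in range(samples + 1):
        rho = cmath.rect(radius, math.pi * k / samples)
        for d in (0.0, 0.25, 0.5, 0.75, 1.0):
            worst = max(worst, abs(rho_times_kernel(rho, d * ZETA)))
    return worst

# Radii chosen between the poles, which here lie close to rho*ZETA = 2*pi*m.
radii = [(2 * m + 1) * math.pi / ZETA for m in range(1, 6)]
bounds = [max_modulus_on_semicircle(r) for r in radii]
for r, b in zip(radii, bounds):
    print(round(r, 3), b)
```

The maxima stay of order unity as the radius grows, so $|K|$ itself falls off like $1/|\rho|$ on these contours; the argument in the text then turns this into the vanishing of the contour-integral in (5.2).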


6. Asymptotic Evaluation of $K(z, t, \kappa)$ if $1+\phi\Phi(z)$ has Zeros

The above evaluation of $K$ and of the contour-integral in equation (5.2), in the simple case of $1+\phi\Phi(z)$ having no zeros, must be extended to the case in which the latter condition is not satisfied. We shall show that twenty-four sub-cases have to be examined in the latter case, reducing to twelve sub-cases upon proper handling. The function $K(z,t,\kappa)$ being given by the two equations (2.5), two cases arise according as $z$ is either smaller or larger than $t$ (equality included in each one as a boundary case). If we vary $\kappa$ along a circular contour as quoted above, $\arg\kappa$ varies from $0$ to $2\pi$. If $\rho^2 = \kappa$, hence $\arg\rho$ varies from $0$ to $\pi$. In this interval of $\arg\rho$ the asymptotic solutions, according to § 3, are in general different, if $0 < \arg\rho \le \pi/2$ and if $\pi/2 < \arg\rho \le \pi$. Furthermore, different asymptotic solutions arise, according to equation (2.5), if $z$ or $t$ is in either one of the intervals I, II and III. Hence, in the case that $z \le t$ we have twelve sub-cases as follows:


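Sub-case No.   arg ρ                            Interval of z   Interval of t

     1         $0 < \arg\rho \le \pi/2$         I               I
     2         $0 < \arg\rho \le \pi/2$         I               II
     3         $0 < \arg\rho \le \pi/2$         I               III
     4         $0 < \arg\rho \le \pi/2$         II              II
     5         $0 < \arg\rho \le \pi/2$         II              III
     6         $0 < \arg\rho \le \pi/2$         III             III
     7         $\pi/2 < \arg\rho \le \pi$       I               I
     8         $\pi/2 < \arg\rho \le \pi$       I               II
     9         $\pi/2 < \arg\rho \le \pi$       I               III
    10         $\pi/2 < \arg\rho \le \pi$       II              II
    11         $\pi/2 < \arg\rho \le \pi$       II              III
    12         $\pi/2 < \arg\rho \le \pi$       III             III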


Here the intervals IV and V of § 3 come into use. If, e.g., $z$ is in the interval III, $z + \zeta$ is in V, etc.

Twelve exactly similar sub-cases arise if $z > t$. From equations (2.5) it is seen, however, that the two expressions for $K(z,t,\kappa)$, if $z$ is either smaller or larger than $t$, differ only in different coefficients $\sigma$. If $\sigma$ is finite, these coefficients cannot materially alter the way in which $K(z,t,\kappa)$ approaches zero if $|\kappa\zeta^2| \to \infty$. Hence the discussion may be limited to the twelve sub-cases quoted above. If it is proved that $|K(z,t,\kappa)\sqrt{\kappa}|$ is bounded asymptotically in these twelve sub-cases, the same conclusion directly applies to the remaining twelve sub-cases corresponding to $z > t$. The complete formulae pertaining to the said twelve sub-cases cover 62 pages in small handwriting of a large-size writing-book in the author's possession. My thanks are due to Mr N. S. Markus for assistance in their discussion. It would of course be far too tedious even to attempt to render an abstract of these formulae. It might be sufficient to record the handling of one sub-case in an abbreviated form and to quote the results of the examination of the remaining eleven sub-cases.

Considering the sub-case No. 1 above, the numerator of $K(z,t,\kappa)$ calls for an evaluation of $w_I$ and $w_{II}$ at the arguments $z$, $t$, $z+\zeta$ and $t+\zeta$, the intervals being I for $z$, $t$, and III for $z+\zeta$, $t+\zeta$. From the equations of § 3 we obtain:





Wi(z) ’ 


Wj(z) = i 


i +<5tO(%) 
1+$$)(£) 

1 + <ft<E>(*o) 




r 

if 2 is in the interval I; 

1/ ip/?i+p2s+p/a,l 1+ ^ <i l^ d,: . 


I+^0(2) | 

- »P/3i + pPs + P J111 + I 

+ <? 

. -ip2i~Pft+pJa. |l+$*l** . -ipSi+P/ 3 .-pJa 1 l 1 + 

— t tQ 

- *>£1 - p£a - pJJL 1 1 + I 
+ *> h 

if 2 is in the interval III; 
if z is in the interval I, [2tfi]=J | 1 + <f>Q>(z) \ %dz\ 


2 p 


_!_ C^PP.+Wn+P&VG + 

, ,{1+^0(2 0 )}{X+^(D(2)} ^ 

_(_ e P&* - ip/81 + pE^a*] _j_ gg - P& + a *P& "* PEM _|_ zgpPa ~ “ PE a a*0 

^ - PA ” ®P& “ pE^a®]^ ^ 

if 2 is in the interval III, the abbreviation [a 2 z] being used for 

[a 2 z] = f | x + | idz. } 

Ja» 


The other expressions are obtained from (6.1) by substitution of $t$ for $z$. Thus the numerator $N$ of $K(z,t,\kappa)$ in the sub-case No. 1, using the abbreviations,


(6.1) 


[a 0 z] = f | i+<f>$(z) | idz, 
Ja 0 

[foil = r | i + <^3>(£) | ^dz, 


becomes: 


N = — 

2 p 


— cre^^ 1 ++ pE to J —(jq~ Wx + pDmQ+pE^i] + ^ - pC®o*3 * 


{l + <£0(>)}{i 4-</><£>(/)} 

- 07<? “ + pIM - p[^i 3 „ a /£ - ip A - pE<m3+pIM 4 . ^ - pE*% 3+pC^il _ ~ pE*«i3 J, 


The denominator $D$ of $K(z,t,\kappa)$ is given by

$$D = \sigma^2 - \sigma W + 1,$$

$W$ being the first expression (3.8). Writing $\rho = \rho_r + i\rho_i$, we have $\rho_r \ge 0$ and $\rho_i > 0$ in the present sub-case No. 1. Assuming $\rho_r > 0$ at first, the modulus of the numerator $|N|$ if $|\rho\zeta| \to \infty$ approaches either


1 

2 P 


{1 + <f>®(z)}{ 1 


O' J ,tfP t -^i+PrE< 2 o*3 +PrE^i] 





or 


2 P 


{1 +cj)<&(z)}{i +<£€>(/)} 


i 


gPrl sa il - Prt ta il t 


Under these conditions | D | approaches 

| (j | fPiPi+PrPi' 

Hence $|\rho K(z,t,\kappa)|$ in both cases is smaller than a positive upper bound $A$, which satisfies


A>| 


j {1 +<f>®{z)}{i+<i><b(t)} 

If $\rho_r = 0$ and $|\rho\zeta| \to \infty$, the numerator $N$ approaches


= B. 


- - 2{sin (pi[a 0 z\) + cos (p*[><,«])} * {sin (pif/aj) + cos (pltaj)}, 

r 

and D approaches 

- 2 ore Pi ^ cos (pijS 2 ). 

Hence, the characteristic values corresponding to $\rho_i\beta_2$ equal to a multiple of $\pi/2$ being excluded, we again find $|\rho K(z,t,\kappa)|$ to be smaller than a finite positive upper bound, if $|\rho\zeta| \to \infty$.

The examination of the other eleven sub-cases has been carried out along exactly similar lines. The result of this discussion is that $|\rho K(z,t,\kappa)|$ is always bounded if $|\rho\zeta| \to \infty$ in all twenty-four sub-cases corresponding to the condition that $1+\phi\Phi(z)$ has two simple zeros in each period $\zeta$. If $z$ and $t$ are complex and no zeros exist, the above formulae may be applied too.


7. Evaluation of the Residues $R_m(z, t)$ in Equation (5.2)

In the vicinity of a pole $\kappa = \lambda_m$ the function $K(z,t,\kappa)$ is approximately represented by the expression:

$$\frac{R_m(z,t)}{\kappa-\lambda_m}.$$

As $K(z,t,\kappa)$ satisfies equation (2.2) with $\kappa = \lambda_m$, this is also true for $R_m(z,t)$ for all values of $z$ and $t$ except $z = t$. As $K(z,t,\kappa)$ satisfies equations (1.2) with respect to $z$ ($K$ being substituted for $w$) and equations (1.3) with respect to $t$ ($K$ being substituted for $w^*$ and $t$ for $z$), the function $R_m(z,t)$ satisfies these same conditions. It may be proved that the solution of equation (2.2) with either the equations (1.2) or (1.3) is unique but for an arbitrary multiplier, if $\lambda$ coincides with a simple characteristic value $\lambda_m$. Denoting a solution of (2.2) and (1.2) with $\lambda = \lambda_m$ by $w_m(z)$, and a solution of (2.2) and (1.3) with $\lambda = \lambda_m$ and $t$ instead of $z$ by $w_m^*(t)$, these being characteristic functions of the said problems, corresponding to the characteristic value $\lambda_m$, we hence have, in this case,

$$R_m(z,t) = -r_m\,w_m(z)\,w_m^*(t), \qquad (7.1)$$

$r_m$ indicating a multiplier independent of $z$ or $t$ to be determined below. In the case of a double characteristic value $\lambda_m$, two corresponding and linearly independent characteristic functions of each problem exist. This case can only arise if $\sigma = \pm 1$, as was stated in § 1, and the corresponding two problems are hence self-adjoint, entailing that $w(z) = w^*(z)$. In this case the characteristic values $\lambda_m$ are all real if $1+\phi\Phi(z)$ has zeros and the expansion problem is considerably simplified, so as to justify its dismissal from the present discussion, referring to previous publications [17, 18].

We now proceed with the determination of $r_m$. To effect this, Green's function $G(z,t)$, resulting from $K(z,t,\lambda)$ in the case that $\lambda = 0$, is considered. This function $G(z,t)$ satisfies the same conditions as $K$, but with $\lambda = 0$. Hence its convergent infinite series expansion may be derived from equation (5.2), taking $\lambda = 0$:

$$G(z,t) = -\sum_{m=1}^{\infty}\frac{R_m(z,t)}{\lambda_m} = \sum_{m=1}^{\infty}\frac{r_m\,w_m(z)\,w_m^*(t)}{\lambda_m}. \qquad (7.2)$$


It will be assumed that this series is uniformly convergent. This proposition will be proved below. Multiplying both sides of (7.2) by $w_m(t)\{1+\phi\Phi(t)\}$ and integrating with respect to $t$ from $z$ to $z+\zeta$, taking into account the bi-orthogonality resulting from (1.4):

$$\int_z^{z+\zeta} w_m(t)\,w_n^*(t)\{1+\phi\Phi(t)\}\,dt = 0, \quad \text{if } m \ne n, \qquad (7.3)$$

we obtain:

$$\int_z^{z+\zeta} G(z,t)\,w_m(t)\{1+\phi\Phi(t)\}\,dt = \frac{r_m\,w_m(z)\int_z^{z+\zeta} w_m(t)\,w_m^*(t)\{1+\phi\Phi(t)\}\,dt}{\lambda_m}. \qquad (7.4)$$

The left side of (7.4) is equal to $w_m(z)/\lambda_m$ by virtue of the linear homogeneous integral equation, equivalent to Hill's problem (2.2), (1.2). Hence the expression:

$$\frac{1}{r_m} = \int_z^{z+\zeta} w_m(t)\,w_m^*(t)\{1+\phi\Phi(t)\}\,dt \qquad (7.5)$$

results. With the aid of this expression the convergent infinite series expansion, resulting from (5.2), may be completed:

$$K(z,t,\lambda) = \sum_{m=1}^{\infty}\frac{r_m\,w_m(z)\,w_m^*(t)}{\lambda_m-\lambda}. \qquad (7.6)$$

The equations (7.2), (7.6) and (7.5) will prove of great value in the solution of several sub-problems, derived from Hill's problem.


8. Uniform and Absolute Convergence of the Series Expansions Obtained

As a first step to the application of equations (7.2), (7.6) and (7.5) it is necessary to prove the uniform convergence of the infinite series involved within each period $z_0 \le z \le z_0 + \zeta$, taking into account the conditions of Hill's problem. We shall discuss this convergence in the two cases: (a) $1+\phi\Phi(z)$ has no zeros, (b) $1+\phi\Phi(z)$ has zeros.

In the case (a), the characteristic values $\lambda_m$ and the corresponding characteristic functions are given in equations (3.3) and (3.2). We shall show that the transformed characteristic function $|\bar w(\bar z)|$, apart from its arbitrary multiplier $A$, and hence the corresponding characteristic function $|w_m(z)|$ too, are finite for any suffix $m$ within the said interval of $z$ and of $\bar z$, if $\sigma$ and hence $\mu$ is finite. This property becomes obvious if we insert (3.3) in (3.2) and thereupon apply the transformation necessary in order to obtain $w_m(z)$. An upper bound may be indicated for $|w_m(z)|$ within the said interval of $z$, this bound being independent of the suffix $m$. The multiplier $A$ need not be considered further, as $A$ occurs equally in both the numerator and the denominator of each term of the infinite series (7.2) and (7.6). Hence only the value of $r_m$ remains to be considered. Making use of the transformation of $w_m$ to $\bar w_m$ and of $z$ to $\bar z$, equation (7.5) may be written (omitting the multipliers $A$ as stated above):

$$\frac{1}{r_m} = \int_{\bar z_0}^{\bar z_1}\bar w_m(\bar z)\,\bar w_m^*(\bar z)\,d\bar z = \int_{\bar z_0}^{\bar z_1} d\bar z = \bar z_1 - \bar z_0. \qquad (8.1)$$

This value is also independent of $m$. If $\lambda$ in equation (7.6) is finite (this assumption being essential) and $m \to \infty$, the absolute values of the consecutive terms are, from a certain fixed value of $m$ onwards, smaller than $M/m^2$, where $M$ is a positive finite bound, independent of $m$. Thus it is shown that this series is absolutely and uniformly convergent in the present case (a).

In the case (b) we may conclude from the discussion of characteristic values by equation (4.2) that $|\lambda_m|$ is approximately proportional to $m^2$ for any finite value of $\sigma$ (and hence of $\mu$) from a certain fixed value of $m$ onwards. We proceed to discuss the value of $|w_m(z)|$ if $m \to \infty$. As no single representation of solutions of equation (2.2) is possible in this case throughout one period, the discussion has to be applied to the different intervals I to III quoted in § 3. We have represented a characteristic function $w_m(z)$ by a linear combination of $w_I(z)$ and $w_{II}(z)$, inserting $\lambda_m$ for $\rho^2$, with two constant multipliers. The ratio of these multipliers has been determined from the conditions (1.2), the resulting two linear homogeneous equations for these multipliers being consistent on account of the value of $\rho^2$ inserted. In this way expressions with one still arbitrary multiplier in front were obtained for $w_m(z)$ and for $w_m^*(z)$, which, on account of $w_I(z)$ and $w_{II}(z)$ being known throughout the said intervals I, II and III (see, e.g., eqs. (6.1)), are known there too. Apart from the said arbitrary multiplier (denoted by $B_1$), these expressions for $|w_m(z)|$ and $|w_m^*(z)|$ are bounded throughout the said intervals for any finite value of $\sigma$ independently of $m$. Considering equations (7.2) and (7.6), the multiplier $B_1$ occurs equally in both the denominator and the numerator and hence cancels out. Disregarding it further, the value of $r_m$ has been shown by insertion of the said expressions for $w_m(z)$ and $w_m^*(z)$ to be bounded independently of $m$. Hence the same conclusion applies to the series (7.2) and (7.6) in the present case (b) as in the case (a) above. Thus the assumption made in § 7 in order to obtain equation (7.5) is justified. The equations (7.6) and (7.2) may be combined so as to obtain an equation, similar to a well-known equation in the case of real characteristic values and self-adjoint problems:


$$K(z,t,\lambda) - G(z,t) = \lambda\sum_{m=1}^{\infty}\frac{r_m\,w_m(z)\,w_m^*(t)}{\lambda_m(\lambda_m-\lambda)}. \qquad (8.2)$$

The series on the right may be indicated as the meromorphic part of $K(z,t,\lambda)$ with respect to $\lambda$, and is absolutely and uniformly convergent within the said interval.
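The combination of (7.6) and (7.2) amounts to a term-by-term partial-fraction identity. Written out as a short worked step, assuming both series converge as established above so that they may be subtracted termwise:

```latex
\begin{aligned}
K(z,t,\lambda) - G(z,t)
 &= \sum_{m=1}^{\infty} r_m\,w_m(z)\,w_m^*(t)
    \left(\frac{1}{\lambda_m-\lambda}-\frac{1}{\lambda_m}\right) \\
 &= \lambda \sum_{m=1}^{\infty}
    \frac{r_m\,w_m(z)\,w_m^*(t)}{\lambda_m(\lambda_m-\lambda)} .
\end{aligned}
```

Since $|\lambda_m|$ grows like $m^2$ while $r_m$, $w_m$ and $w_m^*$ remain bounded, the terms of this series are of order $m^{-4}$ for fixed finite $\lambda$, which is a faster decay than that of (7.2) or (7.6) separately.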


9. Series Expansions of Arbitrary Functions and Solution of Hill's 
Inhomogeneous Problems 


We assume $g(t)$ to be an integrable real function of the real variable $t$ and obtain $f(z)$ by the operation:

$$f(z) = \int_z^{z+\zeta} G(z,t)\{1+\phi\Phi(t)\}\,g(t)\,dt. \qquad (9.1)$$

This function $f(z)$ satisfies the conditions (1.2) as $G(z,t)$ does so with respect to $z$. Substitution of the series expansion (7.2) for $G(z,t)$ in (9.1) yields:

$$f(z) = \sum_{m=1}^{\infty} c_m\,w_m(z), \qquad c_m = \frac{r_m}{\lambda_m}\int_z^{z+\zeta} w_m^*(t)\{1+\phi\Phi(t)\}\,g(t)\,dt. \qquad (9.2)$$

We have hence obtained an absolutely and uniformly convergent series expansion of a function $f(z)$, representable by (9.1) but otherwise arbitrary, as is $g(t)$ (apart from being integrable). In consequence of the uniform convergence of (9.2) we obtain the relation:

$$\lim_{n\to\infty}\int_z^{z+\zeta}\Bigl|f(z) - \sum_{m=1}^{n} c_m\,w_m(z)\Bigr|^2 dz = 0. \qquad (9.3)$$

This relation testifies to the closed and complete character of the infinite bi-orthogonal set of characteristic functions $w_m(z)$ and of their adjoint characteristic functions $w_m^*(z)$. This property has previously been proved in the case of real parameters and $|\sigma| = 1$ [17, 18].

The above series expansion of an arbitrary function may be applied to the solution of Hill's inhomogeneous problem, consisting of the differential equation,

$$\frac{d^2w}{dz^2} + w\{\lambda_0+\gamma_0\Phi(z)\} + w\lambda\{1+\phi\Phi(z)\} = f(z)\{1+\phi\Phi(z)\}, \qquad (9.4)$$

or of the equation, arising from (2.3), with this same expression on the right, and of the boundary conditions (1.2). We assume a uniformly convergent series expansion,

$$w(z) = \sum_{m=1}^{\infty} a_m\,w_m(z) \qquad (9.5)$$

for the solution $w(z)$ of these inhomogeneous problems, $w_m(z)$ being characteristic functions of the homogeneous problem arising from the above one, if the expression on the right side is zero. The expansion of $f(z)$ being given by equation (9.2), we obtain, on substitution into (9.4),

$$a_m = \frac{c_m}{\lambda-\lambda_m}. \qquad (9.6)$$

From (9.6) we may conclude that the expansion (9.5) is indeed uniformly and absolutely convergent, the expansion (9.2) being so, if $\lambda$ is finite. By the application of (9.5), (9.6) a variety of problems in applied mathematics may be solved.
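The coefficient relation between the expansions of $w$ and $f$ can be checked by a short term-by-term substitution. The sketch below assumes, as the text states, that each $w_m$ satisfies (9.4) with zero right-hand side and $\lambda = \lambda_m$, and that (9.5) may be differentiated termwise; the sign of the result depends on that form of the homogeneous equation.

```latex
\begin{aligned}
&\text{Homogeneous problem for each } m:\quad
 w_m'' + w_m\{\lambda_0+\gamma_0\Phi\} = -\lambda_m\,w_m\{1+\phi\Phi\}. \\
&\text{Insert } w=\textstyle\sum_m a_m w_m \text{ and } f=\sum_m c_m w_m
 \text{ into (9.4):} \\
&\qquad \sum_m a_m(\lambda-\lambda_m)\,w_m\{1+\phi\Phi\}
   = \sum_m c_m\,w_m\{1+\phi\Phi\}. \\
&\text{Multiply by } w_n^*,\ \text{integrate over one period and use (7.3):}\quad
 a_n(\lambda-\lambda_n) = c_n .
\end{aligned}
```

The step requires $\lambda \ne \lambda_n$ for every $n$; when $\lambda$ coincides with a characteristic value the corresponding coefficient is undetermined unless $c_n = 0$.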

10. Applications of the Series Expansions 

Besides the series expansions (7.2), (7.5) for $G(z,t)$, we shall make use of the adjoint expansion,

$$G^*(z,t) = \sum_{m=1}^{\infty}\frac{r_m\,w_m^*(z)\,w_m(t)}{\lambda_m}, \qquad (10.1)$$

and of the bi-orthogonal relation (7.3). Firstly, we calculate the integral,

$$\iint G(z,t)\,G^*(z,t)\{1+\phi\Phi(z)\}\{1+\phi\Phi(t)\}\,dz\,dt,$$

by insertion of the expansions (7.2), (7.5), (10.1), and find $\sum_{m=1}^{\infty} 1/\lambda_m^2$ as a result. Continuing this process, we use the iterated functions:


0 - J ^ G(*> h) h){ 1 

^) — G 1 (z, +<56$(4)}^2) 

r*+t 

G n (z, t) =j 2 <?„_*(*, 4 )C-x(A 4)0 +W 4 )}di n 

r z +£ 

G n (z, t) = J ^ Gn-i t n ) G n -i (/, / w ){i + (/ n 


Insertion of (7.2), (7.5), (10.1) yields 


+ £ CO 

Gn - i0, t ) i x + <£$(>)}{1 + <£® (/)}ak dt=y\ —, a n ~ 2 W . 

z '""T. A n n 


F J = 2tT„ 2 2 -7- 
1 A” T Al n 


lim 2 

x a;* A a x n 


and 



$\lambda_1$ denoting the characteristic value of smallest modulus and $k_1$ the multiplicity of this value. Hence:


I 


1 

kfn 

= lim 

F a n 

n 

iH 

<! 




Thus we have obtained an upper bound for $\lambda_1$ by iteration. By the asymptotic formulae for the characteristic values $k_1$ is always finite. We have:


00 1*1 T 

lim T—=V—. 
1 tiZ iM” 


Besides the iterated functions G n we consider: 


(10.3) 


H»(*, 4 =J G n (t, 4){i + ^>(4)}^, 

*.-n 


Now 


Hence, if k x = 1, 


£„(*, 4 G*(», *){i dt -2 TTTi- 


„ , A 4k r m wjjs)w m (t) } 

hm G n (z, 4 = 2 --> 

*-►« m=l A m B 

lim H„(2, 4 = 2,--r+I-» 

»-►* m=l A m n 

except at a zero of a/ m or of zu*. 

1 H n (z, t) F„ 0 

t~= lim —t = hm -=-• 


(10.4) 


(io-S) 


Thus we obtain the first complex characteristic value in this case. Multiplication by 
A“» or AJ» +1 of the first or the second equation (10.4) then yields the corresponding 
complex characteristic functions too. 
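The way ratios of iterated traces single out the characteristic value of smallest modulus can be illustrated numerically. The sketch below is not the author's procedure with the iterated kernels themselves: it assumes that the traces reduce to power sums of the form $F_n = \sum_m \lambda_m^{-n}$ (a notation adopted here for illustration), and it uses a hypothetical complex spectrum growing like $m^2$ with a simple value of smallest modulus.

```python
def trace_sum(char_values, n):
    """F_n = sum over m of lambda_m**(-n) (notation assumed for this sketch)."""
    return sum(lam ** (-n) for lam in char_values)

def smallest_value_estimate(char_values, n):
    """Ratio F_n / F_(n+1), which tends to lambda_1 when lambda_1 is simple."""
    return trace_sum(char_values, n) / trace_sum(char_values, n + 1)

# Hypothetical complex spectrum growing like m^2, for illustration only.
char_values = [(m + 0.3 + 0.1j) ** 2 for m in range(-50, 51)]

exact = min(char_values, key=abs)
for n in (1, 2, 5, 20):
    print(n, smallest_value_estimate(char_values, n), exact)
```

The error of the ratio decreases like $|\lambda_1/\lambda_2|^n$, so convergence is rapid when the two smallest moduli are well separated and slow when they nearly coincide.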

In this case, $k_1 = 1$, we now consider:

$${}_1G(z,t) = G(z,t) - \frac{w_1(z)\,w_1^*(t)}{\lambda_1}.$$

With the aid of ${}_1G(z,t)$ the processes of iteration as described above may be carried out in the same way as with $G(z,t)$. The resulting formulae yield the second characteristic value and the corresponding functions if its multiplicity $k_2 = 1$:


1 v iH n (z f t) v iFfl 0 

T-= 1m ~ r hm 

•**•2 n->oo v^n\ z i l ) n-»oo 
n , .s ^ 

lun 4=2 - — a - 

»“*■«* m—2 A m 


This process may be continued in order to obtain the third characteristic value, etc. 


(10.6) 





REFERENCES TO LITERATURE

1. Besicovitch, A. S., 1932. Almost-periodic functions. Cambridge Univ. Press.

2. Elliott, W. W., 1928, 1929. “Green’s functions for differential systems containing a parameter”, Amer. Journ. Math., L (1928), 243-258; LI (1929), 397-416.

3. Erdélyi, A., 1941. “On Lamé functions”, Phil. Mag., XXXI, 123-130.

4. Ince, E. L., 1925-1927. “Researches into the characteristic numbers of the Mathieu equation”. Part I, Proc. Roy. Soc. Edin., XLVI (1925), 20-29. Part II, ibid., XLVI (1926), 316-322. Part III, ibid., XLVII (1927), 294-301.

5. -, 1927. “The Mathieu equation with numerically large parameters”, Journ. London Math. Soc., II, 46-50.

6. -, 1940. “The periodic Lamé functions”, Proc. Roy. Soc. Edin., LX, 47-63.

7. -, 1940. “Further investigations into the periodic Lamé functions”, ibid., LX, 83-89.

8. Langer, R. E., 1934. “The asymptotic solutions of certain linear ordinary differential equations of the second order”, Trans. Amer. Math. Soc., XXXVI, 90-106.

9. -, 1934. “The asymptotic solutions of ordinary linear differential equations of the second order, with special reference to the Stokes’ phenomenon”, Bull. Amer. Math. Soc., XL, 545-582.

10. -, 1935. “On the asymptotic solution of ordinary differential equations, with reference to the Stokes’s phenomenon about a singular point”, Trans. Amer. Math. Soc., XXXVII, 397-416.

11. Mulholland, H. P., and Goldstein, S., 1929. “The characteristic numbers of the Mathieu equation with purely imaginary parameter”, Phil. Mag., VIII, 834-840.

12. Poincaré, H., 1895. Analytic theory of the propagation of heat (French), Paris.

13. Strutt, M. J. O., 1943. “Bounds for the characteristic parameter-values corresponding to problems of Hill. Part I. Characteristic values of smallest moduli”, Proc. Roy. Acad. Amsterdam, LII, 83-90.

14. -, 1943. Idem. Part II. “Characteristic values of any order” (in Dutch), ibid., LII, 97-104.

15. -, 1943. “Curves of characteristic parameter-values corresponding to problems of Hill. Part I. General character of the curves” (in Dutch), ibid., LII, 153-162.

16. -, 1943. Idem. Part II. “Asymptotic character of the curves” (in Dutch), ibid., LII, 212-222.

17. -, 1943. “Characteristic functions corresponding to problems of Hill. Part I. Completeness of the sets of periodic and of almost-periodic characteristic functions” (in Dutch), ibid., LII, 488-496.

18. -, 1943. Idem. Part II. “Expansion formulae in series of periodic and of almost-periodic characteristic functions” (in Dutch), ibid., LII, 584-591.

19. -, 1932. Lamé’s, Mathieu’s and related functions in physics and engineering (booklet), Springer, Berlin.

20. -, 1934. “Hill’s differential equation in the complex domain” (in German), Nieuw Archief voor Wiskunde, XVIII, 31-55. Abstract in Comptes Rendus Paris Acad., CXCVIII, 1008-1010.

21. Tamarkine, J., 1912. “On some points of the theory of ordinary linear differential equations and on the generalization of Fourier’s series” (French), Rend. del Circolo Mat. di Palermo, XXXIV, 345-382.

22. Whittaker, E. T., and Watson, G. N., 1940. A course of modern analysis, Cambridge.

23. Strutt, M. J. O., 1944. “Real eigen-values of Hill’s problems of the second order” (German), Math. Zeits., XLIX, 593-643.


(Issued separately May 12, 1948)





XXXI.— Thermal Diffusion in some Aqueous Solutions. By Archibald C. Docherty 
and Mowbray Ritchie, Chemistry Department, University of Edinburgh. 
(With Five Text-figures.) 

(MS. received March 25, 1946. Read July 1, 1946) 

The study of thermal diffusion in gases has recently been so far developed that a satisfactory 
theory can in general be applied to explain and predict separations in gaseous mixtures. The 
results have been of value from the practical point of view of simple separation; at the same 
time the method can be used in the elucidation of the laws of interaction between colliding 
gas molecules and of the departures from the postulates of simple kinetic theory. On the 
other hand, thermal diffusion theory as applied to liquid mixtures is much more unsatisfactory, 
the complexity of the liquid system being such as to render uncertain the prediction of sign 
and magnitude of separation. 

When equilibrium conditions are considered, an appreciable separation of two liquids 
involves generally large changes in viscosity, density, etc., such changes making theoretical 
interpretation difficult. The present paper gives some measurements of initial rates of 
separation as near as possible to zero time, when the concentrations and other properties of 
the solution are definitely known. The experimental method was an adaptation of the 
“cascade” method originally applied to gases by Clusius and Dickel (1938), and extended
to liquids by, inter alia, Korsching and Wirtz (1940). The work was begun in the first instance
with a view to the possible separation of sugars in aqueous solution. This class of substance
was regarded as possessing properties which would facilitate such a study, viz. the
non-electrolytic nature, the “normality” of osmotic pressure values for dilute solution, the general
similarity throughout the group of possible hydroxyl and hydrogen bonding with water, the 
range of solubility, the non-volatility, and the availability of such data as density and viscosity. 
The results given refer to the solutes glucose, sucrose, xylose, raffinose; glycerol and acetone 
were included for comparison. 


Experimental Procedure 

The Thermal Diffusion Apparatus .—The thermal diffusion column consisted essentially 
of two concentric thin-glass tubes approx. 150 cm. long, of approx. 1 cm. diameter, so sealed 
together at the lower end that the annular space between the tubes gave an average radial gap 
of 0.68 mm. Initial attempts were made to ensure an exactly constant gap along the tube
length by suitable spacing materials; this, however, was found difficult practically, more
particularly as it was necessary to use as thin glass as possible to ensure that the temperature
gradient applied was not largely lost in the glass walls themselves. The use of spacing
material was finally dispensed with and reliance placed on the selection of suitable tubes 
correctly sealed into positions which would not alter from experiment to experiment. To 
facilitate filling and to avoid breakage in operation the upper ends of the tubes were not sealed 
but held in position by rubber. This arrangement gave satisfactorily reproducible results. 

The solution to be investigated was introduced into the annular space by a fine capillary 
and stopcock at the extreme lower end of the column, this system being also used for the with¬ 
drawal of samples for analysis. The capillary was as short as possible and was sealed on in 
such a way as to leave unaltered the radial separation between the tubes. An outer jacket of 
copper lagged by asbestos was so attached as to cover as much as possible of the column, 
particularly at the lower end; the capillary projected immediately below the heating jacket. 
The temperature gradient was applied by passing cold water of known temperature rapidly 
through the inner tube, and steam or hot water of known temperature rapidly through the 
copper jacket. The difference in temperature for any one stream between entrance and exit 
was never more than a few degrees. As a matter of convenience, both heating and cooling 
streams entered at the column top. 



298 Archibald C. Docherty and Mowbray Ritchie 

It was of importance to determine how accurately the effective mean temperature of the 
solution in the annular space could be estimated from the measured mean temperatures of the 
heating and cooling streams, and how closely the measured temperature gradient corresponded 
to the actual gradient across the solution. To this end a balancing vertical column was 
temporarily attached to the capillary stopcock and the annular space and this column filled 
with water to a fixed level just above the top of the heating jacket. By means of the relationship 
$\rho_1 h_1 = \rho_2 h_2$, the mean density of the water in the diffusion column was determined from the 
difference in column lengths for different applied temperature gradients. Results are given 
in Table I. 


Table I 


                 Measured Temperatures °C.
       Inner Tube             Outer Tube              Mean    Diff. in h   ρ         Calc. Mean
Exp.   Top   Bottom  Av.      Top     Bottom  Av.     Temp.   (cm.)        Calc.     Temp.
1      7.0   7.0     7.0      7.0     7.0     7.0     7.0     0.00         0.99993   7.0
2      7.0   7.0     7.0      23.0    23.0    23.0    15.0    0.15         0.99865   17.8
3      7.0   7.0     7.0      29.0    29.0    29.0    18.0    0.25         0.99798   21.2
4      7.0   7.5     7.3      44.0    44.0    44.0    25.7    0.35         0.99720   24.5
5      7.0   7.5     7.3      51.0    51.0    51.0    29.2    0.45         0.99643   27.4
6      7.0   8.0     7.5      60.0    60.0    60.0    33.8    0.65         0.99488   32.5
7      7.0   8.5     7.8      88.0    83.0    85.5    46.7    1.40         0.98913   40.8
8      7.0   9.0     8.0      100.0   100.0   100.0   54.0    2.20         0.98307   60.0


It was concluded from columns 8 and 11 that for the present purposes the mean solution 
temperature could be satisfactorily obtained from the mean temperatures of the heating and 
cooling streams. The divergences from these values as shown by the density method above 
were connected with the relative thicknesses of the glass tube walls and with the rates of flow, 
and could not with the present experimental arrangements be made appreciably smaller. 
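
The density balance behind Table I can be sketched as follows. This is only an illustration: the balancing column is assumed to stand at 7° C. (density 0.99993, the Exp. 1 value) and to have a height of 128 cm. That height is not stated in the text and is a hypothetical value, as are the function and variable names.

```python
# Sketch of the balancing-column relation rho_1 * h_1 = rho_2 * h_2.
# ASSUMPTIONS (not from the paper): the reference height of 128 cm and the
# convention that the heated column stands higher by diff_h_cm.

RHO_REFERENCE = 0.99993   # g/ml, water at 7 deg C (Table I, Exp. 1)
H_REFERENCE_CM = 128.0    # hypothetical height of the balancing column


def mean_column_density(diff_h_cm, h_ref_cm=H_REFERENCE_CM, rho_ref=RHO_REFERENCE):
    """Mean density of the water in the heated column from the height difference."""
    # Equal pressure at the foot: rho_ref * h_ref = rho_col * (h_ref + diff_h)
    return rho_ref * h_ref_cm / (h_ref_cm + diff_h_cm)


if __name__ == "__main__":
    for diff_h in (0.00, 0.65, 1.40, 2.20):  # "Diff. in h" values from Table I
        rho = mean_column_density(diff_h)
        print(f"diff h = {diff_h:4.2f} cm -> mean density = {rho:.5f}")
```

With the assumed height the computed densities fall close to the tabulated ones; a different reference height would shift them proportionally.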

The mean separation between the tubes was determined by running water from a burette 
into the dry diffusion column and noting the volume required to fill a measured height of 
annular space. The mean separation was thus 0.68 mm. The total volume of water or 
solution corresponding to a “full” tube was approx. 28 ml. 
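
The burette determination can be checked roughly with a thin-annulus estimate. The sketch below assumes a mean annulus radius of 0.5 cm. (from the stated tube diameter of approx. 1 cm.) and that the 28 ml. "full" volume fills a 130 cm. column; the function name and the thin-gap approximation are illustrative assumptions, not the authors' procedure.

```python
import math


def mean_annular_gap_cm(volume_ml, height_cm, mean_radius_cm):
    """Thin-annulus estimate of the gap: volume ~ 2 * pi * r_mean * gap * height."""
    return volume_ml / (2.0 * math.pi * mean_radius_cm * height_cm)


# Assumed inputs: 28 ml over 130 cm, mean radius 0.5 cm (hypothetical value).
gap_mm = 10.0 * mean_annular_gap_cm(volume_ml=28.0, height_cm=130.0, mean_radius_cm=0.5)

if __name__ == "__main__":
    print(f"estimated mean gap = {gap_mm:.2f} mm")
```

Under these assumptions the estimate comes out close to the 0.68 mm. quoted in the text.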

Methods of Analysis .—In all cases the change in concentration at the foot of the tube was 
determined from a 0.5 ml. sample withdrawn from the column after the rejection of a similar 
volume corresponding to the volume of the capillary tube and the portion of the diffusion 
column not directly subjected to the temperature gradient. Preliminary tests on successive 
samples at successive time intervals showed a straight line concentration-time relationship 
for the early time intervals; later results were calculated from two sets of data only. In all 
cases the time of diffusion was taken as small as was consistent with accurate sample analysis. 

In any experiment, the cooling stream was first turned on and the solution in question 
then introduced. At zero time the heating stream was applied; as judged by the expansion 
of the column, a few seconds only were required to reach temperature equilibrium. The 
sample was removed at noted time intervals under these steady conditions, the temperatures 
being noted. Normally the duration of such intervals was of the order 5 to 30 minutes. 
Changes in concentration were expressed in gm. molecules per litre (at 15° C.) per min. and 
were obtained by the following methods of analysis. In all cases the methods were checked 
on control samples; further control experiments in which solutions were maintained at 100° C. 
for times similar to those involved in actual diffusion runs showed no detectable concentration 
changes such as might be expected from sugar hydrolysis. 

Solutions containing one solute, e.g. raffinose, xylose, sucrose, glucose, glycerol, were 
examined by a constant temperature Pulfrich refractometer on the basis of previous calibration 
in sodium light of solutions of known concentrations. In the case of acetone the method used 
was that of Munro (1936), involving the quantitative oxidation of acetone by alkaline hypoiodite, 
acidification, and titration of unused iodine by thiosulphate. 




Thermal Diffusion in some Aqueous Solutions 299 

For solutions containing both sucrose and glucose, a Hilger polarimeter with mercury 
green light was used to give a direct reading of a suitably diluted sample, which was then 
inverted by concentrated hydrochloric acid, the change of rotation being proportional to the 
sucrose present. The glucose was then determined by difference. 

For solutions containing both sucrose and glycerol the method was based on that of Fulmer, 
Hickey and Underkofler (1940), which involves the determination of glucose by the copper 
titration procedure of Shaffer and Somogyi (1933), in conjunction with the oxidation of a 
second similar sample by ceric sulphate. Both glucose and glycerol are oxidised, the glycerol 
being calculated from the amount of ceric sulphate used by the glucose. In order to adapt 
the method to the determination of sucrose, the sucrose was first inverted at 70° C. by sulphuric 
acid. 

In solutions containing both acetone and glycerol, acetone was determined by the alkaline 
hypoiodite method above, control experiments showing that glycerol at the concentrations 
in question did not interfere with the determination. Ceric sulphate was used to oxidise both 
glycerol and acetone, the glycerol being determined by difference. 

Experimental Results and Discussion 

Debye (1939), in a theoretical discussion of the thermal diffusion apparatus of two parallel 
vertical plates with a temperature difference of T°, at a distance apart a and of height k, 
found the value of a for which maximum separation at equilibrium should be obtained, by the 
relation $a^3 = 600\,\eta D/\beta\rho g T$, where $\eta$ is the solvent viscosity, D the coefficient of (ordinary) 
diffusion, $\beta$ the coefficient of cubic expansion, $\rho$ the density, and g the acceleration due to 
gravity. With appropriate values for aqueous solutions at 60° C., a is approx. 0.1 mm. For 
such a maximum separation condition Debye further considered the separation in a time 
shorter than that required to reach equilibrium; when the heated volume is very much greater 
than the volume of the reservoir, the reservoir comprising all liquid not subjected to the temperature 
gradient, the separation is given by a relation into which the height of the column does 
not enter. The separation is then independent of the column length. Since this holds, 
strictly speaking, for the above value of a = 0.1 mm., it was of interest first to ascertain if such 
independence also held for the a value of 0.68 mm. of the experimental arrangement. 
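
The order of magnitude of Debye's optimum separation can be illustrated numerically. The sketch reads the relation as a cubed equal to 600 times viscosity times D, divided by the product of cubic expansion coefficient, density, g and temperature difference. All numerical inputs below are assumed round values for a dilute aqueous solution near 60° C. in c.g.s. units; they are not taken from the paper.

```python
def debye_optimal_gap_cm(eta, diffusion, beta, rho, g, delta_t):
    """Plate separation a (cm) from a**3 = 600 * eta * D / (beta * rho * g * T)."""
    return (600.0 * eta * diffusion / (beta * rho * g * delta_t)) ** (1.0 / 3.0)


# ASSUMED illustrative values (c.g.s.), not from the paper:
ASSUMED = dict(
    eta=4.7e-3,        # poise, viscosity of water near 60 deg C
    diffusion=1.0e-5,  # cm^2/s, ordinary diffusion coefficient
    beta=5.0e-4,       # per deg C, cubic expansion coefficient
    rho=0.983,         # g/ml
    g=981.0,           # cm/s^2
    delta_t=80.0,      # deg C, temperature difference across the gap
)

gap_mm = 10.0 * debye_optimal_gap_cm(**ASSUMED)

if __name__ == "__main__":
    print(f"optimum separation ~ {gap_mm:.2f} mm")
```

With these inputs the result is of the order of 0.1 mm., several times smaller than the 0.68 mm. gap of the apparatus.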

The change in concentration of a glucose solution of 1.726 gm. mol. per litre was measured 
for various lengths of column liquid for various mean temperatures and gradients. The time 
of each experiment was 30 minutes. Results are shown graphically in fig. 1, where the increase 
in glucose concentration at the column foot is plotted against the length of column liquid. 

Curve I refers to gradient 50° C., mean temperature 30° C., curve II to gradient 51° C., 
mean temperature 75° C., and curve III to gradient 95° C., mean temperature 53° C. It will 
be observed that for these conditions, representing the maximum variation which could be 
practically applied, the rate of separation is independent of the column length for lengths 
greater than 100 cm. Thus under the general experimental conditions of a full column of 
130 cm. the rate of separation will not be directly affected by the small reduction in length 
due to the removal of samples for analysis; further, all experiments are comparable in that 
they lie in a region for which the column length is immaterial. As will be indicated later 
(p. 301), the relative positions of these curves are in general agreement with the effects of 
alteration of mean temperature and temperature gradient. 

Single-Solute Solutions .—Fig. 2 shows the effect of variation of concentration of a series 
of single solutes, the operating conditions being otherwise the same throughout, viz. mean 
temperature 60° C., temperature gradient 80° C., total solution volume 28.2 ml. 



Fig. 2. 

The initial separation rate $\Delta X_1$ in gm. mol. per litre per min. is plotted against the initial 
concentration. In all cases the concentration increased at the column foot. All the curves 
show a maximum at an intermediate concentration: the greater the solute molecular weight 
the less the molar concentration at which the maximum occurs. At low concentrations it 
would appear that the greater the molecular weight the greater is the initial rate of separation, 
although for such conditions the curves for glucose and sucrose differ only very slightly. The 
relative position of the curves in general suggested that for some concentrations partial solute 
separation might be practicable and some two-solute solutions were thus examined. 

Two-Solute Solutions .—(a) Sucrose-Glucose. —Results are given for a solution containing 
1.00 gm. mol. of each solute. Operating conditions were as before. In Table II $F_G$ and $F_S$ 
represent the factors by which the molecular concentrations of glucose and sucrose increased. The 
relative separation is very small though in a direction according to expectations based on 
fig. 2. 


Table II 


t (min.)   F_G    F_S    F_S/F_G
0          1.00   1.00   1.00
2          1.02   1.00   0.98
5          1.03   1.01   0.98
10         1.05   1.03   0.98
20         1.08   1.03   0.95
30         1.10   1.05   0.95
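
The ratio column of Table II can be recomputed from the two factor columns. The sketch below assumes the reading adopted here, namely that the first factor column refers to glucose and the second to sucrose, and that the last column is the sucrose factor divided by the glucose factor; the variable and function names are illustrative.

```python
# (time in min, glucose factor, sucrose factor, printed ratio) as read from Table II.
TABLE_II = [
    (0, 1.00, 1.00, 1.00),
    (2, 1.02, 1.00, 0.98),
    (5, 1.03, 1.01, 0.98),
    (10, 1.05, 1.03, 0.98),
    (20, 1.08, 1.03, 0.95),
    (30, 1.10, 1.05, 0.95),
]


def relative_separation(f_glucose, f_sucrose):
    """Ratio of the sucrose enrichment factor to the glucose enrichment factor."""
    return f_sucrose / f_glucose


if __name__ == "__main__":
    for minutes, f_g, f_s, printed in TABLE_II:
        ratio = relative_separation(f_g, f_s)
        print(f"t = {minutes:2d} min  computed = {ratio:.3f}  printed = {printed:.2f}")
```

The recomputed ratios agree with the printed column to within rounding of the two-decimal factors.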


The experiments were repeated for different concentrations of glucose and sucrose; $F_S/F_G$ 
ratios are given in Table III for the various times of experiment. 

Again the separations are in general small and in certain instances contrary to expectation. 
It is obvious that single-solute solution results cannot be used to predict such separations; 
certain other properties of the solution must be considered. 




(b) Sucrose-Glycerol. —A solution containing per litre 0.90 gm. mol. sucrose and 0.97 gm. 
mol. glycerol was subjected to the same conditions of thermal diffusion for 30 min. At the 
end of that time the concentration of sucrose had increased by the factor 1.22, while that of 
the glycerol increased by the factor 1.16; relative separation was again only of the order 
5 per cent. 

Table III 


Glucose   Sucrose             Time (min.)
Conc.     Conc.     0      5      10     20     120
2.00      1.00      1.00   0.99   1.00   0.99   0.99
2.00      0.50      1.00   1.03   1.05   ..     1.03
3.00      0.50      1.00   1.15   1.12   1.21   1.15
0.50      1.00      1.00   1.00   0.93   ..     0.96


Temperature Variation 

(a) Variation of Mean Temperature .—The temperature gradient was kept constant at 40° C. 
and the mean temperature altered for single-solute solutions of glucose, sucrose, glycerol, 
all of 30 gm. per 100 ml. Results are shown graphically in fig. 3. 



Fig. 3. 

A rapid increase in initial rate appears for increased mean temperature; the continuous 
curves have been drawn on the basis of a general formula derived later (see following paper). 

(b) Variation of Temperature Gradient. —A glucose solution of 30 gm. per 100 ml. was here 
employed, and the temperature gradient varied for constant mean temperatures. Practical 
adjustment of such temperatures was somewhat difficult, but results are shown in fig. 4, where 
the numbers at each experimental point refer to the actual mean temperature, and the 
continuous curves I, II and III are calculated curves for mean temperatures 40°, 60° and 80° C. 
respectively. Again increase in temperature gradient increases the separation rate; the 
relationship is not a linear one. 

These results are in general correspondence with the curves of fig. 1. In curve I of fig. 1 
gradient and mean temperature are both low; in curve II the gradient is the same, and the 
marked increase in rate is due to the increased mean temperature; in curve III the reduction 
of mean temperature is compensated by the increase in temperature gradient. 

The Sign of the Separation. —In all the previous experiments with raffinose, sucrose, 
glucose, xylose and glycerol, the separations have been positive in the sense that an increase 
in concentration was found at the lower end of the column. The literature (see Korsching and 
Wirtz, 1940; Gillespie and Breck, 1941) shows that there is a general correspondence between 
the sign of the separation and the density of the solution as compared with the solvent density, 
although exceptions appear to occur. All the above give solutions of density greater than water 
itself, and the results are thus in agreement with this generalisation. The case of acetone was 
then considered; this substance will be different in its relation to water by the absence of 
appreciable hydroxyl or hydrogen bonding as compared with the sugars, but may be compared 
with glycerol in that the density of water is almost intermediate between the two, and the 
apparent molecular volumes as calculated directly from the densities and the molecular 
weights are approximately the same. 

Because of the low boiling-point of acetone a lower operating mean temperature and 
gradient were necessary. The relation between initial rates of separation and concentration 
was determined for a mean temperature of 30° and a gradient of 40° C. Results are shown in 
fig. 5; in all experiments a decrease in concentration was observed at the column foot. 

Apart from the change in sign, the curve is of the same general type as before, although 
the separation at low concentrations appears abnormally small. 



Fig. 4. (Abscissa: temperature gradient, °C.) 

It was first necessary to inquire whether the experimental method involving convection 
did not introduce a factor independent of the actual thermal diffusion. In such convection 
as determined by gravity the tendency will always be for the liquid of less density to accumulate 
at the top. There was therefore the possibility that the real movement of acetone was to the 
colder region as with glycerol, but the resultant density change in the measured time of experiment 
might cause a reversal of convection. 

Early experiments by Wereide (1914) by the Soret method involving no “cascade” convection 
gave a separation to the cold region of aqueous solutions of glycerol, but no observed separation 
of acetone or of ethyl and methyl alcohols, all of which latter give solutions of less density than 
water. This might thus be the result of two competing influences, thermal diffusion to the 
cold region balanced by convection in the system. An attempt to repeat such a Soret-type 
experiment was made by means of an inverted V-tube containing 50 per cent. by volume of 
acetone. One limb was kept at 50° and the other at 5° C.; after 66 hours, examination by a 
Zeiss interferometer indicated a very slight migration of acetone to the hot region. 

A return was made to the convection column with three methods of approach: 

(a) The time of diffusion was reduced to a minimum consistent with measurement of 
concentration change, so that any density change would be very small. The interferometer, 
however, showed that even after only 30 seconds with a 10 per cent. volume solution some 
acetone had migrated from the foot of the column. 

(b) The acetone concentration was reduced as far as possible. The rate of separation 
then was very small, but a 1 per cent. volume solution after 30 seconds still showed a detectable 
reduction of acetone concentration at the foot of the column. 

(c) A ternary mixture of water, glycerol and acetone was examined. If the acetone does 
in fact move to the cold region, the simultaneous migration of glycerol would bring the density 
up to a value such that normal convection would take place. For a solution containing 0.2965 
gm. mol. glycerol and 0.0962 gm. mol. acetone, the concentrations after 10 min. at the column 
foot were 0.405 and 0.0935 respectively; for solutions containing 0.165 and 0.1375 the resultant 
concentrations were 0.295 and 0.126. It was thus concluded that for the concentration 
and temperature conditions in question, acetone does migrate to the hot region apart from any 
convection complications. This is then in agreement with recent work of Wirtz (1943), 
although De Groot (1945) reports a separation to the cold region for certain unspecified 
conditions. 
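
The ternary results in (c) can be expressed as enrichment factors at the column foot, which makes the opposite movements of the two solutes explicit. The sketch below uses only the four pairs of concentrations quoted above (gm. mol., initial and after 10 min.); the names are illustrative.

```python
# (initial, final) concentrations at the column foot, as quoted in the text.
RUNS = [
    {"glycerol": (0.2965, 0.405), "acetone": (0.0962, 0.0935)},
    {"glycerol": (0.165, 0.295), "acetone": (0.1375, 0.126)},
]


def enrichment_factor(initial, final):
    """Final concentration divided by initial concentration at the column foot."""
    return final / initial


if __name__ == "__main__":
    for number, run in enumerate(RUNS, start=1):
        for solute, (initial, final) in run.items():
            factor = enrichment_factor(initial, final)
            print(f"run {number}: {solute:8s} factor = {factor:.3f}")
```

A factor above unity means accumulation at the (cold) foot; in both runs glycerol exceeds unity while acetone falls below it.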

From the practical point of view, the previous results indicate that an apparatus of the 
present simplicity cannot conveniently be employed to separate dissolved sugar solutes in 
such aqueous solutions. An attempt to correlate and explain such results theoretically is 
presented in the following paper. While the experimental results were satisfactorily reproducible, 
it may be emphasised here that the present arrangement with its glass construction 
and irregular annular space may be expected to complicate the already difficult problem of such 
thermal diffusion, more particularly from the aspects of turbulence in convection and of 
correct mean temperature and temperature gradient determination. Further refinement of 
the experimental arrangement is therefore necessary. 

Fig. 5. (Abscissa: acetone concentration, gm. mol. per litre.) 

The authors desire to express their thanks to the Carnegie Trustees for the award of a 
Scholarship to one of them (A. C. D.), during the tenure of which part of the above work was 
carried out. 


Summary 

Thermal diffusion in aqueous solutions of raffinose, sucrose, glucose, xylose, glycerol and 
acetone was studied in respect of variation of concentration and of temperature. Initial rates 
of separation were determined as produced by an apparatus consisting of two vertical concentric 
glass tubes of length 130 cm. and mean annular separation 0.68 mm. 

With a solution of glucose of 1.7 gm. mol. per litre, the initial rate of separation for the 
extremes of temperature which could be experimentally applied was found to be independent 
of the column length for lengths greater than 100 cm. 




With the exception of acetone, all the solutes concentrated at the column foot. In single-solute 
solutions a maximum rate was observed at an intermediate concentration, which in the 
sugar series was at a lower molar concentration the higher the molecular weight of the solute. 
With the two-solute solutions sucrose-glucose and sucrose-glycerol little or no relative separation 
was obtained, contrary to expectation based on the single-solute data. 

Initial rates increased with rise in mean temperature and with increase in the applied 
temperature gradient. 


REFERENCES TO LITERATURE 

Clusius, K., and Dickel, G., 1938. Naturwiss., XXVI, 546. 

Debye, P., 1939. Ann. der Physik, XXXVI, 284. 

De Groot, 1945. L’Effet Soret (Amsterdam), 38. 

Fulmer, E. J., Hickey, R. J., and Underkofler, L. A., 1940. Ind. and Eng. Chem., Anal. Ed., 
XII, 729. 

Gillespie, L. J., and Breck, S., 1941. Journ. Chem. Phys., IX, 370. 

Korsching, H., and Wirtz, K., 1940. Ber., LXXIII, 249. 

Munro, J., 1936. Thesis, Edinburgh University, 94. 

Shaffer, P. A., and Somogyi, M., 1933. Journ. Biol. Chem., C, 695. 

Somogyi, M., 1937. Ibid., CXVII, 771. 

Wereide, T., 1914. Ann. de Physique, II, 55, 67. 

Wirtz, K., 1943. Naturwiss., XXXI, 416. 


(Issued separately May 12, 1948) 


This paper was assisted in publication by a grant from the Carnegie Trust for the Universities of Scotland. 





XXXII.— An Elementary Treatment of Thermal Diffusion in Gaseous and Liquid 
Systems. By Mowbray Ritchie, Chemistry Department, University of 
Edinburgh. (With Three Text-figures.) 

(MS. received March 25, 1946. Read July 1, 1946) 

It has been stated that appreciation of the principles of thermal diffusion in gases is difficult 
without detailed study of the transport equations for actual as opposed to ideal “ kinetic theory” 
molecules. Such difficulty is obviously enhanced for the liquid state where even the simple 
diffusion law in relation to concentration is not fully developed (cf. Fürth, 1945). It was 
desired to have some simple if very approximate theory by which the main features noted 
in the previous paper might be examined; at the same time, any such theory must in principle 
be capable of reproducing the essentials of thermal diffusion in gaseous systems. 

Thermal diffusion is thus considered for both gases and liquids in an “elementary strip”. 
The expression for liquids is then combined with the convection effect of the diffusion column. 
The general approach to the problem was as follows. 

Consider a volume of a binary gaseous or liquid mixture, represented by the elementary 
strip ABCD (fig. 1), where the faces AB and CD are maintained at fixed temperatures T − ΔT 
and T + ΔT. It is experimentally observed that the concentration of one species increases in 
one temperature region and decreases in the other. By simple kinetic theory there is no 
explanation of this phenomenon. A molecule of one species will diffuse from the intermediate 
T region in the direction of, say, the T + ΔT region at a rate determined by the conditions. 
At the same time the reverse process may be visualised as taking place to an equal extent; 
if the conditions along the path of transfer are the same, no net change in concentration will 
occur. This will apply also to the other type of molecule in the mixture, although the actual 
distance travelled in unit time will be different. No relative concentration gradient is therefore 
to be expected, because, essentially, the conditions encountered in the path between the two 
regions are regarded as the same no matter what the direction of transfer. It has become 
apparent that the general explanation of the observed phenomenon lies in the fact that a 
molecule from the T region on entering the T + ΔT region does not at once lose the diffusional 
characteristics evident in the T region. The effect may thus be described as a persistence of 
molecular characteristics. If then a T molecule be regarded as “projected” into a T + ΔT 
region, the reverse “projection” of a T + ΔT molecule will not exactly restore the original 
situation, because the diffusional characteristics are not now the same throughout the path of 
travel. The same will apply, though to a different degree, to the other type of molecule, and 
a relative concentration change will then be found. 

Fig. 1. (Elementary strip ABCD; temperature regions T − ΔT, T and T + ΔT.) 

To determine the direction of concentration change as a function of pressures or concentrations, 
masses and diameters, a molecule is considered as diffusing from the central T region 
separately into the two regions of higher and lower temperatures; the difference between such 
rates of diffusion for the two kinds of molecule, on the condition that the total pressure must 
remain the same throughout the system, will give a measure of the concentration change. 
Inasmuch as the gas system must be capable of such treatment to give results in agreement with 
experiment, a gas mixture is first considered. 


Gas System 


The relative rate of diffusion of a molecule A through a gas X at temperature T may be 
taken as $D_X/[X]$, where $D_X$ is a diffusion constant containing molecular weight and diameter 
factors only. If the A molecules are of concentration [A], the number of A molecules diffusing 
in a given direction in unit time may be taken as $[A]D_X/[X]$. 

Where the A molecules diffuse through a binary mixture of $X_1$ and $X_2$ molecules, the time 
taken to diffuse a given distance in the mixture may be taken as the sum of the times for the 
constituents considered singly. Thus for one A molecule, 

$$\text{time} \propto \frac{[X_1]}{D_{X_1}} + \frac{[X_2]}{D_{X_2}},$$

and hence 

$$\text{rate} \propto \left(\frac{[X_1]}{D_{X_1}} + \frac{[X_2]}{D_{X_2}}\right)^{-1}. \tag{1}$$

Let the binary gas mixture be represented by two different molecular species of molecular 
weights $M_1$, $M_2$, diameters $\sigma_1$, $\sigma_2$, at pressures $[X_1]$, $[X_2]$ at temperature T. Since the total 
gas pressure must be constant, the corresponding concentrations at T + ΔT and T − ΔT will be 

$$\frac{[X_1]T}{(T+\Delta T)},\quad \frac{[X_2]T}{(T+\Delta T)},\quad \frac{[X_1]T}{(T-\Delta T)},\quad \frac{[X_2]T}{(T-\Delta T)}.$$

The number of $M_1$ molecules passing from T to T + ΔT will be proportional to 

$$(\Delta X_1) = [X_1]\left[\frac{[X_1]T}{(T+\Delta T)}\Big/ D_{X_1X_1} + \frac{[X_2]T}{(T+\Delta T)}\Big/ D_{X_1X_2}\right]^{-1}.$$

This flow must be balanced by a movement of the whole mass of binary gas mixture in the 
reverse direction, where now the relative numbers of $M_1$ and $M_2$ molecules so returning will 
be determined by the relative concentrations of $M_1$ and $M_2$, viz. 

$$(\Delta X_1)[X_1]/([X_1]+[X_2]) \quad\text{and}\quad (\Delta X_1)[X_2]/([X_1]+[X_2]).$$

The number of $M_1$ molecules passing from the T region to the T − ΔT region is similarly 
treated; then, by applying the same considerations to the $M_2$ molecules and summing the 
various terms, we find the difference in the numbers of $M_1$ and $M_2$ molecules passing from the 
colder to the hotter region, i.e. the separation, to be 

$$\frac{4[X_1][X_2]}{([X_1]+[X_2])}\,\frac{\Delta T}{T}\,(D_{X_1} - D_{X_2}), \tag{2}$$

where $D_{X_1}$ and $D_{X_2}$ denote the rates of equation (1) for the $M_1$ and $M_2$ molecules in the mixture. 

If the D terms, or their difference, be regarded as temperature independent over a short 
temperature range, we have on integration, 

$$S \propto \frac{[X_1][X_2]}{([X_1]+[X_2])}\,(D_{X_1} - D_{X_2})\,\ln T. \tag{3}$$


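The diffusion coefficient $D_X$ may be put as proportional to $(1/M_A + 1/M_X)^{1/2}/\sigma_{AX}^2$, where $M_A$ and $M_X$ 
are the respective molecular weights and $\sigma_{AX}$ is the sum of their radii. Thus by equation (1), 

$$D_{X_1} \propto \left[\frac{[X_1]\,\sigma_{1.1}^2}{(1/M_1 + 1/M_1)^{1/2}} + \frac{[X_2]\,\sigma_{1.2}^2}{(1/M_1 + 1/M_2)^{1/2}}\right]^{-1} \tag{3a}$$

with a corresponding term for $D_{X_2}$. 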
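
The elementary gas model can be sketched numerically from the statements in the text alone: the rate through a single gas goes as a pair constant divided by concentration, times in a mixture add, and the pair constant goes as the square root of the sum of reciprocal molecular weights divided by the square of the collision diameter. The composition-dependent part of the separation is then the mole-fraction product times the difference of the two mixture rates. The function names, the use of mole fractions and the omission of all constant factors are assumptions of this sketch.

```python
import math


def pair_constant(m_a, m_x, sigma_ax):
    """Pair diffusion constant, proportional to sqrt(1/M_A + 1/M_X) / sigma_AX**2."""
    return math.sqrt(1.0 / m_a + 1.0 / m_x) / sigma_ax ** 2


def mixture_rate(x_self, x_other, m_self, m_other, sigma_self, sigma_cross):
    """Rate of one species through the binary mixture: diffusion times are additive."""
    time_like = (x_self / pair_constant(m_self, m_self, sigma_self)
                 + x_other / pair_constant(m_self, m_other, sigma_cross))
    return 1.0 / time_like


def relative_separation(x1, m1, m2, sigma11, sigma22, sigma12):
    """Composition-dependent part of the separation (constant factors omitted)."""
    x2 = 1.0 - x1
    d1 = mixture_rate(x1, x2, m1, m2, sigma11, sigma12)
    d2 = mixture_rate(x2, x1, m2, m1, sigma22, sigma12)
    return x1 * x2 / (x1 + x2) * (d1 - d2)


if __name__ == "__main__":
    # Equal diameters, unequal masses: positive value, lighter species to the hot side.
    print(relative_separation(0.5, 100.0, 110.0, 1.0, 1.0, 1.0))
```

A positive value corresponds to accumulation of species 1 in the hot region; the value vanishes when either mole fraction is zero.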

The expression (3) is then in accordance with the essential features of gaseous thermal 
diffusion as follows:— 


(1) The logarithmic dependence on temperature is in agreement with the expressions 
derived by Chapman and Cowling (1939) and by Fürth (1942). 

(2) No separation is of course obtained when $[X_1]$ or $[X_2]$ is zero, and there must be one 
value of $[X_1]$ and $[X_2]$ for which the separation is a maximum. This will not necessarily 
occur at $[X_1] = [X_2]$, because there are concentration factors in the diffusion terms also; the 
position of the maximum will be determined by the relative masses and diameters. For a 
given binary mixture the initial separation will, however, be independent of the total pressure 
for given relative values of $[X_1]$ and $[X_2]$. 

(3) If the diameters of the gas molecules are equal, the sign of the separation will be 
determined by the relative molecular weights. The expression is positive if $M_1 < M_2$, i.e. the 
lighter molecules will accumulate in the hot region. If the masses are equal, the molecules of 
smaller diameter will similarly accumulate in the hot region. 

(4) A numerical illustration of the application of the above expression may be given for 
the case of two isotopes, when the masses are different but the diameters and any force law 
correction may be taken as the same for the two species. The numerical value of the thermal 
diffusion factor will depend on the effective diameter, but the ratio $k_T$ of this factor to the 
coefficient D of ordinary diffusion will not contain a diameter term and may thus be compared 
with the value derived from more rigorous theory. 

With $[X_1] + [X_2] = 1$, and with the ratio of the isotopic masses not far from unity, the 
coefficient D may be taken as $D = (1/M_1 + 1/M_2)^{1/2}/\sigma^2([X_1] + [X_2])$. The ratio $k_T$ then becomes 





( 4 ) 


Furry, Jones and Onsager (1939) quote Enskog’s result for this particular case as 

$$k_T = \frac{105}{118}\,\frac{(M_2 - M_1)}{(M_2 + M_1)}\,[X_1][X_2], \tag{5}$$

and Chapman’s earlier expression as 

$$k_T = \frac{17}{3}\,\frac{(M_2 - M_1)}{(M_2 + M_1)}\,\frac{[X_1][X_2]}{(9.15 - 8.25[X_1][X_2])}, \tag{6}$$

while experimental determination of $k_T$ led Furry, Jones and Onsager to suggest that 

$$k_T = 0.35\,\frac{(M_2 - M_1)}{(M_2 + M_1)}\,[X_1][X_2] \tag{7}$$

be used as a provisional value for the design of apparatus. The following table shows values 
of $k_T$ calculated for $M_1 = 100$, $M_2 = 110$ by each of the above expressions. 


Table I 



[X_1]    (4)       (5)       (6)       (7)
0.1      0.0024    0.0038    0.0029    0.0015
0.5      0.0060    0.0106    0.0095    0.0042
0.9      0.0023    0.0038    0.0029    0.0015


Formula (4) above may therefore be used to give reasonable values of the ratio for such 
conditions. 
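
The tabulated values for expressions (5), (6) and (7) can be recomputed as a check. The sketch below takes the numerical coefficients as read here from the printed formulae (105/118 for Enskog, 17/3 with the denominator 9.15 − 8.25 times the mole-fraction product for Chapman, and 0.35 for the provisional value); the function names are illustrative, and expression (4) is not included.

```python
def mass_factor(m1, m2):
    """Common factor (M2 - M1) / (M2 + M1)."""
    return (m2 - m1) / (m2 + m1)


def kt_enskog(x1, m1, m2):
    """Expression (5), as read here."""
    return (105.0 / 118.0) * mass_factor(m1, m2) * x1 * (1.0 - x1)


def kt_chapman(x1, m1, m2):
    """Expression (6), as read here."""
    product = x1 * (1.0 - x1)
    return (17.0 / 3.0) * mass_factor(m1, m2) * product / (9.15 - 8.25 * product)


def kt_provisional(x1, m1, m2):
    """Expression (7), the provisional value of Furry, Jones and Onsager."""
    return 0.35 * mass_factor(m1, m2) * x1 * (1.0 - x1)


if __name__ == "__main__":
    for x1 in (0.1, 0.5, 0.9):
        values = [f(x1, 100.0, 110.0) for f in (kt_enskog, kt_chapman, kt_provisional)]
        print(x1, " ".join(f"{value:.4f}" for value in values))
```

All three expressions are symmetric in the two mole fractions, which is why the rows for 0.1 and 0.9 coincide in columns (5) to (7).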

(5) Examination of equation (3) shows that for given diameters and masses the sign of the 
separation is always positive, or negative, or zero, independent of the relative concentration; 
for a particular binary mixture there is no indication of a change of sign as the relative concentrations 
are altered. Grew (1944), however, observed experimentally such a change in mixtures 
of neon and ammonia. Here the respective molecular weights and diameters are approximately 
the same, and the sign of the separation is then very sensitive to small changes in the diameter 
factors. The effect may be simply treated in relation to the general equation (3) on the basis 
that the magnitude of the effective collision diameter of the ammonia molecule is somewhat 
greater in an ammonia-ammonia collision than in an ammonia-neon collision, by virtue of the 
“softness” of the ammonia as compared with the neon molecule. If we take certain standard 
values for the collision diameters, equation (3a) may be amended by the insertion of a factor 
1 + Δ to give 

I rvr 1 / . "\\ 2 r"V ~i l” - * 

izb) 


D Xl 

U X 1 X, 


oc [X 1 ](I + [XJcr^ 

00 Ui/Mi + i/Mj)*' (i/Mj. + 1/Mj)*. 


Relative values of S (equation (3)) have been calculated for a constant temperature by taking 

NH₃ = $M_1$ = 17, Ne = $M_2$ = 20, $\sigma_{1.1}$ = 2.38, $\sigma_{2.2}$ = 2.30, $\sigma_{1.2} = \sigma_{2.1}$ = 2.34, 1 + Δ = 1.03; 

it is remarkable that this simple amendment reproduces very satisfactorily not only the correct 
change in sign at a mole fraction of ammonia of 0.25 as in Grew’s experiments, but also the 
numerical relative separations in each region. This is shown in fig. 2, where the continuous 
curve was calculated on the above basis, with a proportionality constant of $5 \times 10^{-6}$, and Grew’s 
experimental points are indicated by ○. 

(6) It is obvious that thermal diffusion with actual molecules cannot strictly be dealt with 
on the simple theory resulting in equation (3), but must be largely dependent on the force law 
governing the collisional approach of the molecules. An indication of the effect of such a force 
law may be presented as follows. 

The above simple theory of thermal diffusion is based essentially on the assumption that 
the rate of diffusion on the higher temperature side is greater than that on the lower temperature 
side because of the decrease in molecular concentration at the higher temperature. The rate 
of diffusion is in fact determined by the collision numbers of the diffusing molecule; if the 
number of collisions per unit time is the same in both temperature regions no thermal diffusion 
will occur. The number of collisions per second will be proportional to the effective collision 
area and to the velocity of the molecule, since the pressure remains constant (Glasstone, 1940), 
i.e. $Z \propto V\sigma^2$. If the repulsive force between the molecules be represented by $F = -K/r^s$, 
where r is the distance between the molecules and K is a constant, then, following Frenkel 
(1940), $V\sigma^2$ may be derived in terms of V by the method of dimensions. Since $K/r^s$ 
has the dimensions of force $MLT^{-2}$, K has dimensions $ML^{s+1}T^{-2}$. Now $mV^2$ is of 
dimensions $ML^2T^{-2}$, so that $K/mV^2$ is of dimensions $L^{s-1}$. The radius $\sigma$ of dimension L 
must then be proportional to $(K/mV^2)^{1/(s-1)}$ or to $V^{-2/(s-1)}$ for constant m and K. Hence the number 
of collisions $V\sigma^2$ is proportional to $V \cdot V^{-4/(s-1)} = V^{(s-5)/(s-1)}$. The velocity V will increase with rise in 
temperature, but if s = 5 (Maxwellian molecules), the number of collisions will be proportional 
to $V^0$, i.e. will be constant and independent of temperature; thermal diffusion will not be 
observed. 

If $s > 5$, the number of collisions undergone by a low temperature molecule “projected”
into a high temperature region will tend to be less than the number of collisions undergone by
a high temperature molecule in the high temperature region; the diffusing molecule will move
further on the hot side than on the cold. This is the state of affairs corresponding to the
elementary theory above; high values of s tend to approximate to kinetic theory conditions.

If $s < 5$, the number of collisions will be proportional to $1/V^n$ where n is positive. Here
higher temperatures and increased V mean a smaller number of collisions, and the diffusing
molecule will then move further on the cold side.
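The three cases of the force-law argument may be illustrated by evaluating the exponent of V exactly. The sketch below is an illustration only and is not taken from the paper; the function name `collision_exponent` is an assumption introduced here, and it evaluates $n = (s-5)/(s-1)$ in $Z \propto V^n$ for a force varying as $1/r^s$.

```python
from fractions import Fraction


def collision_exponent(s):
    """Return n such that the collision number Z is proportional to V**n.

    Dimensional argument: sigma ~ V**(-2/(s-1)), hence
    Z ~ V * sigma**2 ~ V**(1 - 4/(s-1)) = V**((s-5)/(s-1)).
    """
    s = Fraction(s)
    if s <= 1:
        raise ValueError("the dimensional argument requires s > 1")
    return 1 - Fraction(4) / (s - 1)


# Illustrative force-law indices: softer than, equal to, and harder than s = 5.
for s in (3, 5, 9, 13):
    n = collision_exponent(s)
    if n > 0:
        trend = "collisions increase with V"
    elif n < 0:
        trend = "collisions decrease with V"
    else:
        trend = "collisions independent of V (Maxwellian case)"
    print(f"s = {s:2d}: exponent = {n}  ->  {trend}")
```

A positive exponent corresponds to the case $s > 5$ of the text, a negative exponent to $s < 5$, and zero to Maxwellian molecules.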


Liquid System 


In deriving an equation for thermal diffusion in the liquid state by the general principles 
above, it is necessary to obtain an expression for the rate of diffusion of a molecule through a 
liquid medium. Glasstone, Laidler and Eyring (1941, p. 525) by the theory of absolute 
reaction rates as applied to viscosity and diffusion, consider an expression which may be put 
in the form $D = Fe^{-\epsilon_0/kT}$; here the factor F contains not only the masses $M_1$ and $M_2$ of solute
and solvent molecules, but also molecular volume and temperature parameters, and $\epsilon_0$ is the
potential energy barrier for viscous flow.

When the diffusing molecule is of the same kind as the solvent molecule, as in self-diffusion,
the diffusion coefficient above may be reduced (loc. cit., p. 519) to the approximate value
$D \approx kT/(d_1\eta)$, where $d_1$ is the diameter of the diffusing molecule and $\eta$ is the viscosity of the medium.

When a large molecule diffuses in a medium of relatively small molecules, it is regarded as
probable (loc. cit., p. 520) that the movement of solvent is the rate-determining process, and the


coefficient becomes D = — where a is a factor in the neighbourhood of unity. These two 

expressions are to be compared with the Stokes equation for “macroscopic” diffusing particles 

D °c On this basis it is to be expected that in the case of a sugar or glycerol molecule 

in aqueous solution the rate of diffusion would become relatively greater as the concentration 
increases, quite apart from the viscosity factor. For the present purpose we may put
$D \propto T/(d_1\eta)$ and include in the diameter any variation in rate with the relative sizes of diffusing
and solvent molecules.

An elementary strip is then considered as before (fig. 1). The solute molecules are represented
as of mass $M_1$, diffusional diameter $d_1$, molecular volume $V_1$ at concentration $X_1$, with
corresponding symbols, $M_2$, etc. for the solvent. The number of molecules of type $M_1$ passing
from the $T$ region through the $T + \Delta T$ region is then proportional to

$(\Delta X_1) = \dfrac{X_1 T}{d_1\,\eta_{T+\Delta T}}$,

corresponding to a molecule at temperature T passing through a region of viscosity $\eta_{T+\Delta T}$.
This flow of $M_1$ molecules must be balanced by a reverse flow of both solute and solvent
molecules because of the constant pressure conditions. The volume of $M_1$ molecules entering
the $T + \Delta T$ region is $(\Delta X_1)V_1$, and since these eventually attain the $T + \Delta T$ equilibrium the
volume increase will be $(\Delta X_1)V_1\rho_1/\rho'_1$, where $\rho_1$ is the density of $M_1$ molecules at T, and $\rho'_1$ is
the corresponding density at $T + \Delta T$. This volume of solution is then to be returned to the
T region. Of this volume, the volume of $M_1$ molecules present may be taken as

$(\Delta X_1)V_1\,\dfrac{\rho_1}{\rho'_1}\,\dfrac{V_1X_1}{(V_1X_1 + V_2X_2)}$,

and therefore the number of $M_1$ molecules reaching the T region is

$(\Delta X_1)\,\dfrac{V_1}{V_1}\,\dfrac{\rho_1}{\rho'_1}\,\dfrac{V_1X_1}{(V_1X_1 + V_2X_2)}\,\dfrac{\rho'_1}{\rho_1} = (\Delta X_1)\,\dfrac{V_1X_1}{(V_1X_1 + V_2X_2)}$.

Similarly the number of $M_2$ molecules returned in this process is

$(\Delta X_1)\,\dfrac{V_1}{V_2}\,\dfrac{V_2X_2}{(V_1X_1 + V_2X_2)}\,\dfrac{\rho_1\rho'_2}{\rho'_1\rho_2}$.





Expressions of similar type may then be obtained for an initial flow of $M_2$ molecules, and the
entire series of operations performed similarly for flow towards the $T - \Delta T$ region. Examination
of expansions of solutes and solvent over the relatively small temperature region involved
shows that $\rho_1\rho'_2/\rho'_1\rho_2$ will not be far from unity, and we may put $\rho_1\rho'_2/\rho'_1\rho_2 = 1 + a$ with
$\rho'_1\rho_2/\rho_1\rho'_2 = 1 - a$; further, if the expansions be taken as proportional to the temperature, so
that $\rho/\rho' = 1 + \alpha(\Delta T)$, a may be replaced by $(\alpha_1 - \alpha_2)\Delta T$, where $\alpha_1$ and $\alpha_2$ are the expansion
coefficients per degree for solute and solvent respectively. By summation of the twelve different
rates, the separation rate between $M_1$ and $M_2$ molecules in the direction $(T - \Delta T) \rightarrow (T + \Delta T)$
is then found to be

$\Delta S \propto A + B$, (8)


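where

$A = \dfrac{X_1X_2}{(V_1X_1 + V_2X_2)}\,(V_1 + V_2)\left(\dfrac{1}{d_1} - \dfrac{1}{d_2}\right)T\left(\dfrac{1}{\eta_{T+\Delta T}} - \dfrac{1}{\eta_{T-\Delta T}}\right)$

and

$B = (\alpha_1 - \alpha_2)\,\dfrac{X_1X_2}{(V_1X_1 + V_2X_2)}\left(\dfrac{V_1}{d_1} + \dfrac{V_2}{d_2}\right)\left(\dfrac{1}{\eta_{T+\Delta T}} + \dfrac{1}{\eta_{T-\Delta T}}\right)$.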

A proper appreciation of the factors of this expression can only be obtained by adequate
integration, which in this case adds further complexities. Some idea of the relative importance
of the A and B terms can, however, be obtained directly. The expansions of liquids over the
temperature range concerned are generally small, and their differences are smaller still. We
may, therefore, first consider the implications of the A term alone. The viscosity of a solution
always decreases with rise in temperature; the sign of the separation will be determined by the
difference of the diffusion diameter reciprocals. For $d_1 > d_2$ the expression is negative; large
molecules should therefore migrate to the low temperature region and, in the vertical cascade
system, will tend to collect at the column foot. The separation will naturally be zero when
$X_1$ or $X_2$ is zero, and a maximum will occur with changing concentration, the position of which,
however, will be determined by viscosity as well as by the direct concentration values. If the
difference between the diffusion diameters is small, the effect of the B term will become
increasingly important.

If the diffusion diameter be taken as the cube root of the apparent molecular volume
(see Table II below, p. 312), then diameters decrease in the order sucrose, glucose, glycerol,
acetone, water. In all cases, then, $(1/d_1 - 1/d_2)$ is negative and indicates concentration of
solute at the column foot. In the case of acetone, $(1/d_1 - 1/d_2)$ has the smallest numerical
value, and, other things being equal, we would thus expect the B term to exert its greatest
influence on the acetone-water system. Further, on the assumption that the expansions of
solutes and solvent are respectively the same in solution as in the free state, the series of
increasing expansions is water, sugar, glycerol, acetone. Here the value of the sugar expansion has
been estimated from the expansion of concentrated solutions, since normally the expansion
of a solution lies between that of solvent and solute. In all cases $(\alpha_1 - \alpha_2)$ is positive; where
this factor is largest and $(1/d_1 - 1/d_2)$ is smallest, as in acetone-water, there will therefore be a
decided tendency for $\Delta S$ to become positive also, corresponding to solute concentration at the
column top.

The effects of altered concentration and temperature are more complex. In the case of the
former, increasing solute concentration in the sugar series means increased solution viscosity.
The temperature change of viscosity for a constant temperature range is here greatest at high
concentrations; in dilute solutions $(1/\eta_{T+\Delta T} - 1/\eta_{T-\Delta T})$ in term A is small as compared with
$(1/\eta_{T+\Delta T} + 1/\eta_{T-\Delta T})$ in term B, and the effect of the B term will thus be greatest at low
concentrations. Neglect of the B term will thus give ratios of experimental to calculated values
which will increase as the concentration of solute increases. Other factors being equal, such
a drift in ratios will be most pronounced for high $V_1$ values in term B.

Before comparison between calculated and experimental values can be attempted, 
integration of equation (8) must be coupled with conversion from the elementary strip of fig. 1 
to the cascade convection system. Both are complex, and in view of the many assumptions 
already made, integration of the A term alone of equation (8) was carried out. 

With

$\Delta S \propto T(1/\eta_{T+\Delta T} - 1/\eta_{T-\Delta T})$ or $dS \propto T\,d(1/\eta)$,

the variation of viscosity with temperature may be represented by $\eta = Pe^{Q/T}$. Then

$S \propto (Q/P)\int e^{-Q/T}\,T^{-1}\,dT$.

This integration gives a convergent series which for aqueous solutions (Q = 2000 cal. approx.)
may be represented by the first term only: thus

$S = k'(T_2/\eta_2 - T_1/\eta_1)$, where $k'$ does not include Q or P.
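The size of the terms neglected in keeping only the first term of the series may be gauged numerically. The following is a rough sketch under stated assumptions, not the author's calculation: Q is taken as 2000 in the same units as T, P is set arbitrarily to unity (it cancels in the ratio), the limits are 293 and 373, and the function names are introduced here for illustration. Integration by parts of $(Q/P)\int e^{-Q/T}T^{-1}dT$ gives $T/\eta$ as the leading term, which is compared with Simpson's rule applied to the full integral.

```python
import math


def full_integral(Q, P, T1, T2, steps=2000):
    """Simpson's rule for (Q/P) * integral of exp(-Q/T)/T dT from T1 to T2."""
    if steps % 2:
        raise ValueError("Simpson's rule needs an even number of steps")
    h = (T2 - T1) / steps

    def integrand(T):
        return math.exp(-Q / T) / T

    total = integrand(T1) + integrand(T2)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * integrand(T1 + i * h)
    return (Q / P) * total * h / 3.0


def first_term(Q, P, T1, T2):
    """Leading term T2/eta2 - T1/eta1, with eta = P * exp(Q/T)."""
    def eta(T):
        return P * math.exp(Q / T)

    return T2 / eta(T2) - T1 / eta(T1)


# Assumed illustrative values: Q in the units of T, P arbitrary.
Q, P = 2000.0, 1.0
T1, T2 = 293.0, 373.0
ratio = full_integral(Q, P, T1, T2) / first_term(Q, P, T1, T2)
print(f"full integral / first term = {ratio:.3f}")
```

With these assumed values the ratio falls somewhat below unity, so the first term alone slightly overestimates the integral; for relative rates at a fixed Q much of this factor is absorbed in the constant $k'$.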

In the cascade convection system, the rate of relative movement of the two streams of liquid
will be determined largely by the difference in density of the streams and by the viscosity at the
mean temperature. The following expression is then obtained, increase in concentration at
the column foot being given the positive sign:

$S = K\,\dfrac{(\rho_{T'_1} - \rho_{T'_2})}{\eta_T}\,\dfrac{X_1X_2}{(V_1X_1 + V_2X_2)}\,(V_1 + V_2)\left(\dfrac{1}{d_2} - \dfrac{1}{d_1}\right)\left(\dfrac{T_2}{\eta_2} - \dfrac{T_1}{\eta_1}\right)$. (9)

Here $T_2 - T_1$ is the applied temperature gradient, T is the mean temperature $(T_2 + T_1)/2$,
$T'_2 = (T_2 + T)/2$ and $T'_1 = (T + T_1)/2$.

For comparison with this equation, the experimentally observed changes in concentration
$\Delta X_1$ require conversion to $S_{exp}$ values. If $X_1$ and $X_2$ are initial concentrations in grm. mol.
per litre, and $X'_1$ and $X'_2$ are the final concentrations after the period of thermal diffusion,

$S_{exp} = (X'_1 - X'_2) - (X_1 - X_2) = (X'_1 - X_1) - (X'_2 - X_2) = \Delta X_1 - \Delta X_2$.

But

$V_1X_1 + V_2X_2 = V_1X'_1 + V_2X'_2 = 1000$,

and hence

$\Delta X_2 = -V_1(\Delta X_1)/V_2$ and $S_{exp} = (V_1 + V_2)\Delta X_1/V_2$.
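The conversion from an observed change $\Delta X_1$ to $S_{exp}$ may be sketched as follows. This is an illustration only: the function name `s_exp_from_dx1` and the sample change of 0.010 grm. mol. per litre are assumptions introduced here, while the molecular volumes 70.2 (glycerol) and 18 (water) are those of Table II.

```python
def s_exp_from_dx1(delta_x1, v1, v2=18.0):
    """Convert an observed solute change dX1 into S_exp = dX1 - dX2.

    The total volume V1*X1 + V2*X2 is taken as constant (1000 c.c.),
    so the solvent change is dX2 = -V1*dX1/V2.
    """
    delta_x2 = -v1 * delta_x1 / v2
    return delta_x1 - delta_x2


# Hypothetical observed change for a glycerol solution (V1 = 70.2).
example = s_exp_from_dx1(0.010, v1=70.2)
print(f"S_exp = {example:.4f}")
```

The result agrees with the closed form $(V_1 + V_2)\Delta X_1/V_2$, since $\Delta X_1 - \Delta X_2 = \Delta X_1(1 + V_1/V_2)$.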

Variation of Mean Temperature .—When the temperature gradient is constant but the 
mean temperature is varied for a solution of constant concentration, relative rates should then 
be approximately expressed by 

$\Delta X_1 = k_1(T_2/\eta_2 - T_1/\eta_1)/\eta_T$, (10)

on the assumption that in the temperature range concerned the density of the solution may be 
considered as linearly related to the absolute temperature. The application of this equation 
is shown for 30 per cent. solutions of glucose, sucrose and glycerol by the continuous curves of
fig. 3 of the previous paper (p. 301). For viscosity in millipoises, the values of the constants $k_1$
were taken as 0.032 (glucose), 0.019 (sucrose) and 0.026 (glycerol). The agreement between
theory and experiment is such that in each case the variation in mean temperature may be 
adequately represented by the above equation. 

Variation of Temperature Gradient .—When the mean temperature is constant and the 
temperature gradient alone is varied, the assumption that density varies linearly with tempera¬ 
ture gives 

$\Delta X_1 = k_2(T_2 - T_1)(T_2/\eta_2 - T_1/\eta_1)/\eta_T$. (11)

For the glucose solution in question (30 per cent. glucose) $k_2$ was taken as $5.5 \times 10^{-4}$, with
viscosity in millipoises. For the experimental conditions, satisfactory reproducibility of 
observed rates was more difficult than usual, this being connected with the manipulation of the 
heating and cooling streams, but, as shown in fig. 4 of the preceding paper (p. 302), the 
calculated continuous curves are again in fair agreement with experiment. 

It may be here remarked that replacement of the factor by 

causes little difference in calculated relative separation values. The general equation (8) may 
be put in the form 

AS cc 

the fact that the variation of temperature may be represented by the A term alone of 
equation (8) does not require that the B factor is entirely negligible. 


/ bT 


cT 


V7t+at 




The Sign of the Separation and the Effect of Concentration Change.—'For comparison with 
experiment, relative rates have been calculated by equation (9), omitting the proportionality 
constant, on the assumption that the diffusion diameter d is given by $(M/\rho)^{1/3}$. Since glucose
and sucrose are solids, the apparent solute density in solution has been obtained from the
expression $(M_1/\rho)X_1 + 18X_2 = 1000$ (where $X_1$ and $X_2$ are concentrations in grm. mol. per litre)
on the assumption that the molecular volume of water remains at 18. The normal molecular
weights were employed. The resulting values vary very slightly with the concentration of the
solution used: for the present purposes 10 per cent. solutions were considered. Data are shown
in Table II. 


Table II

Substance   M     V      d     ρ_calc   ρ_obs          (d₁ − d₂)   (1/d₂ − 1/d₁)

Water       18    18     2.6   0.998    0.998          ..          ..
Acetone     58    65.6   4.0   0.885    0.79           1.4         0.135
Glycerol    92    70.2   4.1   1.311    1.26           1.5         0.141
Glucose     180   109    4.8   1.65     1.56 (solid)   2.2         0.177
Sucrose     342   206    4.9   1.66     1.59 (solid)   3.3         0.181

For comparison the ordinary values of the densities of the pure substances are given under ρ_obs;
for each solute the calculated density is appreciably greater, but was adopted as corresponding 
more closely to the conditions to which the general equation is considered to apply. The 
results of such calculations are given in Table III. 


Table III.—T₂ = 373, T₁ = 293

X₁      X₂     X₁X₂    η_T     T₂/η₂ − T₁/η₁   ρ₄₀       ρ₈₀       ρ₄₀ − ρ₈₀   S_calc   S_exp    S_exp/S_calc

Glycerol
1.16    51.0   59.0    5.79    89.3            1.023     1.017     0.0060      0.0675   0.047    0.70
2.41    46.0   111     7.34    76.7            1.049     1.0425    0.0065      0.093    0.083    0.89
3.45    42.0   145     9.63    64.3            1.0695    1.062     0.0075      0.090    0.089    1.00
4.59    37.5   172     11.75   54.1            1.093     1.085     0.0080      0.0785   0.082    1.04
5.93    32.1   191     17.0    40.1            1.120     1.1105    0.0095      0.053    0.057    1.10
8.03    23.8   191     33.6    20.0            1.162     1.150     0.0120      0.0168   0.023    1.38

Glucose
0.555   52.0   28.9    5.81    95.8            1.03052   1.00875   0.0218      0.234    0.033    0.141
1.11    48.6   54.0    7.36    71.6            1.06775   1.04544   0.0223      0.262    0.074    0.28
1.665   44.9   74.8    10.4    57.4            1.10201   1.07954   0.0225      0.210    0.099    0.47
2.22    41.5   92.1    13.8    41.6            1.14481   1.12211   0.0227      0.142    0.102    0.72
2.77    38.0   105.5   19.9    30.2            1.17792   1.15498   0.0229      0.082    0.087    1.05

Sucrose
0.292   52.0   15.0    5.88    85.8            1.03050   1.00868   0.0218      0.229    0.034    0.15
0.614   48.1   29.6    7.88    67.2            1.07135   1.04909   0.0223      0.270    0.082    0.30
0.876   44.9   39.4    10.9    52.8            1.10465   1.08230   0.0224      0.262    0.091    0.34
1.170   41.4   48.4    15.7    41.1            1.14165   1.11910   0.0226      0.138    0.081    0.58
1.460   37.8   55.2    22.2    27.0            1.17800   1.15510   0.0231      0.075    0.065    0.88

Acetone
2.74    45.6   125     12.0    29.2            0.9781    0.9676    0.0102      0.035    −0.043   −1.3
5.48    35.5   194     13.7    26.0            0.9527    0.9387    0.0140      0.058    −0.216   −3.7
8.23    24.9   205     12.2    29.4            0.9186    0.8990    0.0196      0.109    −0.412   −3.7
10.96   13.3   146     7.8     35.0            0.8672    0.8472    0.0200      0.148    −0.341   −2.3



In acetone, the mean temp, was taken as 30° C., with T x = xo° C., T 2 - 50° C.; values of p are 
given for 18 0 and 38° C. All values of viscosities and densities were obtained by means of the 
International Critical Tables, with the exception of glucose solution densities, which were 
calculated on the basis that the increase in density for concentrations in grm. per 100 c.c. is the 
same for glucose as for sucrose. 
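The tabulated S_calc values for glycerol may be re-derived from the other columns. The sketch below is a check under stated assumptions rather than the author's own procedure: equation (9) is read as S = (ρ₄₀ − ρ₈₀) X₁X₂ (V₁ + V₂)(1/d₂ − 1/d₁)(T₂/η₂ − T₁/η₁) / [η_T (V₁X₁ + V₂X₂)] with the constant K omitted, the fourth and fifth columns of Table III are taken to be η_T and T₂/η₂ − T₁/η₁, and V₁X₁ + V₂X₂ is set to 1000. The names `s_calc` and `GLYCEROL_ROWS` are introduced here for illustration.

```python
# Molecular volumes and reciprocal-diameter difference for glycerol (Table II).
V1, V2 = 70.2, 18.0
INV_D_DIFF = 0.141  # 1/d2 - 1/d1

# Glycerol rows of Table III:
# (X1*X2, eta_T, T2/eta2 - T1/eta1, rho40 - rho80, tabulated S_calc)
GLYCEROL_ROWS = [
    (59.0, 5.79, 89.3, 0.0060, 0.0675),
    (111.0, 7.34, 76.7, 0.0065, 0.093),
    (145.0, 9.63, 64.3, 0.0075, 0.090),
    (172.0, 11.75, 54.1, 0.0080, 0.0785),
    (191.0, 17.0, 40.1, 0.0095, 0.053),
    (191.0, 33.6, 20.0, 0.0120, 0.0168),
]


def s_calc(x1x2, eta_t, t_over_eta_diff, d_rho, total_volume=1000.0):
    """Equation (9) as read above, with the proportionality constant omitted."""
    return (d_rho * x1x2 * (V1 + V2) * INV_D_DIFF * t_over_eta_diff
            / (eta_t * total_volume))


results = [s_calc(*row[:4]) for row in GLYCEROL_ROWS]
for row, value in zip(GLYCEROL_ROWS, results):
    print(f"recomputed {value:.4f}   tabulated {row[4]:.4f}")
```

On this reading the recomputed values agree with the tabulated S_calc column to within the rounding of the tabulated data.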





As shown by the inconstancy of the ratios of the final columns, equation (9) does not 
rigidly account for the variation of separation rate with changing concentration, and it is 
therefore to be concluded that the B term of equation (8) cannot be here neglected. The sign 
of the separation in glycerol, glucose and sucrose is, however, correct, and the drift of the ratios 
is in agreement with the general expectations previously discussed. There is, indeed, some 
indication of an approach to the same ratio value at high concentrations. With these solutes 
also the calculated values show maxima which are not far removed from the experimental 
values. This is shown in fig. 3, where the discontinuous curves show the calculated values for 
glycerol and sucrose, the values of the proportionality constant k having been chosen in each 
case to make the maximum calculated and experimental rates equal. In these cases the 
calculated maxima occur at concentrations less than that observed experimentally; the 
converse is true for acetone. 

As already indicated, the small value of $(1/d_2 - 1/d_1)$ and the large coefficient of expansion
of acetone tend to change the sign of the calculated separation. It is further probable that
$(1/d_2 - 1/d_1)$, as already calculated for acetone-water, is subject to a considerable error,
correction of which would favour the same tendency. The other solutes are all akin to water in the
occurrence of hydroxyl groups, with consequent hydroxyl bonding in aqueous solution, and
therefore their relative diffusion diameters as estimated from the molecular volume may not be
far wrong. A relative diameter calculated in the same way for acetone, which must be taken
as having little bonding with water, may be expected to be sensibly in error; $1/d_2 - 1/d_1$ will
be even smaller than the previously calculated value.

Fig. 3.

With changing concentration of acetone the viscosity factors do not alter widely; the 
relative constancy of the final ratios of the acetone table above is therefore not unexpected. 

The experimental results of the preceding paper have shown that appreciable separation of
sugar solutes in aqueous solution by thermal diffusion is not easily achieved. For solutes of
$d_1$ value much greater than that of water, $1/d_1$ will be negligible, and examination of the A
term of the general equation (9) shows that the rate $\Delta X_1$ for a given $X_1$ concentration will then
be proportional to $V_2$, i.e. become constant, more particularly when viscosity and density
values are similar for the different sugar solutions. In the same way the contribution to the
separation $\Delta X_1$ by the B term, already not predominant, will not be much altered by the
altering $V_1$ and $d_1$ values; only a small change in relative rate is to be expected, this being
again in agreement with experiment.

It is thus to be generally concluded that in thermal diffusion in a binary liquid mixture, the 
substance with the larger diffusion diameter will tend to collect in the cold region if the expan¬ 
sions of the two liquids are equal: if the diffusion diameters are equal, the liquid with the 
higher expansion will collect in the hot region. 

Korsching and Wirtz (1940) carried out an extensive series of thermal diffusion experiments
involving organic compounds, where the difference in diameters may be expected to be small. In
the systems for which density data are available, viz. water-methyl alcohol, water-ethyl alcohol,
water-butyl alcohol, benzene-thiophene, benzene-cyclohexane, cyclohexane-carbon tetrachloride,
cyclohexane-n-hexane, chlorobenzene-toluene, n-hexane-n-octane, n-hexane-carbon
tetrachloride, all with one exception conform to the rule that the compound of higher expansion
migrates to the top of the diffusion column. The exception is cyclohexane-carbon tetrachloride,
where cyclohexane accumulated at the column top. Here, however, density data at
10° and 30° indicate only a very slightly higher expansion for the tetrachloride (1.0246 as against
1.0241); the expansions are so similar that the diameter factor is presumably predominant.
Of considerable interest is the mixture benzene-cyclohexane; here the densities are appreciably
different (0.889 and 0.788 respectively at 10°), but the expansions between 10° and 30° are again
very similar (1.0245 and 1.0241). In this case Korsching and Wirtz recorded no separation.

The general conclusions here reached will naturally require modification if other factors 
become predominant, e.g. activity coefficients, osmotic pressure effects, change of effective 
diffusional diameter with changing concentration. It is further to be emphasised that the 
general theory and equations above have been built on the assumption of a diffusion equation, 
which can only be regarded as correct for molecules which are of large size as compared with 
the molecules of the medium in which they diffuse. For such “macroscopic” molecules, the d
value will be as determined from the apparent molecular volume. In a solution when the 
diffusing molecule is of the same dimensions as the solvent molecule, this is not necessarily true, 
because a certain free space must be envisaged as surrounding the more or less incompressible 
molecule. This free space, which is again connected with the density in that for incompressible 
molecules of approximately the same volume the reciprocal of the density is some measure of 
the free space, will be determined by the attraction between the molecules. The ordinary 
diffusion coefficient in solution is a complex function which involves not only the diameters and 
masses of diffusing and solvent molecules, but also the free space in the liquid; in as much as 
free space, density and expansion coefficients all depend on molecular attraction, it is to be 
concluded that internal pressures must be considered in the further study of thermal diffusion 
in solution. 

The author expresses his thanks to Dr A. C. Docherty for valuable discussion and help 
in the preparation of the above paper, and to Imperial Chemical Industries Ltd. for a 
Fellowship during the tenure of which part of the work was carried out. 


Summary 

An elementary theory of thermal diffusion applicable to gaseous and liquid systems has 
been developed. This is based on the difference of diffusional characteristics of a molecule 
considered as diffusing through two different temperature regions, when the pressure is constant 
throughout. 

For gaseous systems, the resultant expression is shown to be in general accordance with 
experimental variation of temperature, mass, and diameter factors, and is further developed to 
include isotopic separation, change of sign of separation with concentration, and general force 
law considerations. 

A similar approach to thermal diffusion in solution, combined with the convection effect 
of a “cascade” system, gives an expression which is in general agreement with the results of 
experimental variation of mean temperature and temperature gradient for aqueous solutions 
of sucrose, glucose and glycerol. The simple expression does not account rigidly for the sign 
of separation or the effect of altered concentrations. These discrepancies are discussed in 
relation to the general formula; it is concluded that in addition to the diffusion diameters, the 
relative thermal expansions of solute and solvent are of importance in this connection. 





REFERENCES TO LITERATURE 

Chapman, S., and Cowling, T. G., 1939. Mathematical Theory of Non-Uniform Gases, Cambridge.

Frenkel, J., 1940. Phys. Rev., lvii, 661.

Furry, W. H., Jones, R. C., and Onsager, L., 1939. Phys. Rev., lv, 1083.

Fürth, R., 1942. Proc. Roy. Soc., A, clxxix, 461.

-, 1945. Journ. Sci. Instr., xxii, 61.

Glasstone, S., 1940. Textbook of Physical Chemistry (Macmillan), 269.

Glasstone, S., Laidler, K. J., and Eyring, H., 1941. The Theory of Rate Processes, McGraw-Hill.

Grew, K. E., 1944. Phil. Mag. (VII), xxxv, 30.

Korsching, H., and Wirtz, K., 1940. Ber. dtsch. chem. Ges., lxxiii, 249.


(Issued separately May 12, 1948)


This paper was assisted in publication by a grant from the Carnegie Trust for the Universities of Scotland. 





XXXIII.— Applications of Elliptic Functions to Wind Tunnel Interference. 

By L. M. Milne-Thomson 

(MS. received August 29, 1946. Read February 3, 1947)

It is well known that the problem of wind tunnel interference with an aerofoil is reducible to
the two-dimensional problem of determining the additional upwash velocity v due to the
presence of the tunnel boundaries. The local corrections to incidence and to drag coefficient
are then $\epsilon$ and $\epsilon C_L$ where $\epsilon = v/V$, V being the wind speed and $C_L$ the lift coefficient (Glauert,
1930; Pistolesi, 1932).

Consider a wind tunnel whose cross-section is a given curve in the z-plane. Let the region
interior to this curve be mapped on the region interior to a circle of unit radius in the Z-plane
by the mapping function $z = f(Z)$. With vortices of strengths k at $z_1$ and $-k$ at $z_2$ there will
correspond vortices k at $Z_1$ and $-k$ at $Z_2$, and the complex potential w in the Z-plane is
therefore (Milne-Thomson, 1938)

$w = ik\log(Z - Z_1) - ik\log(Z - Z_2) \mp \left(ik\log(Z - 1/\bar{Z}_1) - ik\log(Z - 1/\bar{Z}_2)\right)$. (1)

Here the upper sign is taken for a closed working section for which the stream function is
constant on the circle, and the lower sign for an open working section for which the velocity
potential is constant on the circle. The bar denotes the conjugate complex quantity.

If $w_f$ denotes the complex potential when the tunnel boundary is absent, we can write
$w = w_f + w_i$, so that $w_i$ denotes the interference potential. Since

$w_f = ik\log(z - z_1) - ik\log(z - z_2)$,
we have the interference velocity given by 


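$u - iv = -\dfrac{dw_i}{dz} = -\dfrac{dw}{dZ}\dfrac{dZ}{dz} + \dfrac{dw_f}{dz}$

$= -\dfrac{ik}{f'(Z)}\left\{\left(\dfrac{1}{Z - Z_1} - \dfrac{1}{Z - Z_2}\right) \mp \left(\dfrac{1}{Z - 1/\bar{Z}_1} - \dfrac{1}{Z - 1/\bar{Z}_2}\right)\right\} + ik\left(\dfrac{1}{z - z_1} - \dfrac{1}{z - z_2}\right)$. (2)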


This gives the interference velocity at the point z due to the vortex pair k at $z_1$, $-k$ at $z_2$. The
interference due to a continuous distribution of vorticity along the span can be obtained by
integration.

For simplicity we shall consider only those cases in which the span B of the aerofoil lies on
the real axis in the z-plane with its centre at the origin, and in which the points of the span
map into points of the real axis in the Z-plane, the two origins corresponding. We can then
write

$z_1 = \tfrac12 x$, $z_2 = -\tfrac12 x$, $Z_1 = \tfrac12 X$, $Z_2 = -\tfrac12 X$, (3)

all real quantities. For elliptic loading we put $4\pi k = -G\,d\sqrt{(1 - x^2/B^2)}$ and integrate across
the span from $x = 0$ to $x = B$, G being the circulation round the centre section of the aerofoil.
The last term of (2) will contribute the downwash due to elliptic loading, namely $G/2B$, so
that (2) now gives, since $u = 0$,

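$\dfrac{4\pi v}{G} - \dfrac{2\pi}{B} = \dfrac{1}{Bf'(Z)}\displaystyle\int_0^B\left\{\left(\dfrac{1}{Z - X/2} - \dfrac{1}{Z + X/2}\right) \mp \left(\dfrac{1}{Z - 2/X} - \dfrac{1}{Z + 2/X}\right)\right\}\dfrac{x\,dx}{B\sqrt{(1 - x^2/B^2)}}$. (4)

Note that the variables are X and x where $\tfrac12 x = f(\tfrac12 X)$.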


As a particular application of the general formula (4) consider the elliptical cross-section
of major axis 2a and minor axis 2b.

The interior of the ellipse is mapped on the interior of the unit circle by

$z = c\sin\mu t$, $Z = m^{1/4}\,\mathrm{sn}\,t$, $c^2 = a^2 - b^2$, $\mu = \pi/2K$, $a = c\cosh\tfrac12\mu K'$, $b = c\sinh\tfrac12\mu K'$, (5)

where m is the squared modulus and K, K′ are the real and imaginary quarter periods of the
elliptic functions. To see this, observe that the elliptic functions $\mathrm{sn}(\tfrac12 iK' + t)$, $\mathrm{sn}(-\tfrac12 iK' + t)$
are such that a simple pole of one is a simple zero of the other, and therefore their product is an
elliptic function devoid of poles and is consequently a constant, in fact,

$\mathrm{sn}(\tfrac12 iK' + t)\,\mathrm{sn}(-\tfrac12 iK' + t) = m^{-1/2}$. (6)

Thus if t is real, the two functions are conjugate imaginaries, and therefore

$|\,\mathrm{sn}(\tfrac12 iK' + t)\,| = m^{-1/4}$.

Thus as t describes the straight line from $\tfrac12 iK' + K$ to $\tfrac12 iK' - K$ in the t-plane, we see from (5)
that Z describes a semicircle of unit radius in the Z-plane, and that z describes a semi-ellipse
in the z-plane, whence the correspondence is easily established (Neville, 1944).
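The elliptic part of this correspondence is easily checked numerically. The sketch below is illustrative only and rests on stated assumptions: it reads the mapping as $z = c\sin\mu t$ with $\mu = \pi/2K$, $a = c\cosh\tfrac12\mu K'$, $b = c\sinh\tfrac12\mu K'$, and it treats K and K′ as freely chosen positive numbers (in the paper they are tied together through m, which does not affect this particular check). The function names are introduced here.

```python
import cmath
import math


def ellipse_axes(c, K, K_prime):
    """Semi-axes a, b of the ellipse traced by z = c*sin(mu*t) on Im t = K'/2."""
    mu = math.pi / (2.0 * K)
    half = 0.5 * mu * K_prime
    return c * math.cosh(half), c * math.sinh(half)


def boundary_point(tau, c, K, K_prime):
    """Image z of the point t = tau + i*K'/2 under z = c*sin(mu*t)."""
    mu = math.pi / (2.0 * K)
    return c * cmath.sin(mu * complex(tau, 0.5 * K_prime))


# Assumed illustrative values for c, K and K'.
c, K, K_prime = 1.0, 1.8, 1.6
a, b = ellipse_axes(c, K, K_prime)

# Sample the straight line from -K + iK'/2 to K + iK'/2.
taus = [-K + 2.0 * K * i / 40 for i in range(41)]
points = [boundary_point(tau, c, K, K_prime) for tau in taus]
residuals = [abs((z.real / a) ** 2 + (z.imag / b) ** 2 - 1.0) for z in points]

print("largest deviation from the ellipse:", max(residuals))
print("a^2 - b^2 - c^2 =", a * a - b * b - c * c)
```

Every sampled point has a non-negative imaginary part, so the line maps on to one half of the ellipse, and the foci lie at $\pm c$.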

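Thus with the span of the aerofoil along and centred on the major axis we may write:

$\tfrac12 x = c\sin\mu t = \tfrac12 B\sin\mu\tau$, $\tfrac12 X = m^{1/4}\,\mathrm{sn}\,t$,
$z = c\sin\mu T$, $Z = m^{1/4}\,\mathrm{sn}\,T$. (7)

Then (4) gives

$\dfrac{2Bv}{G} - 1 = \dfrac{B\,\mathrm{cn}\,T\,\mathrm{dn}\,T}{2\pi c\cos\mu T}\displaystyle\int_0^K\left(f_1(t) \mp f_2(t)\right)\sin\mu\tau\,d\tau$, (8)

where

$f_1(t) = \dfrac{1}{\mathrm{sn}\,T - \mathrm{sn}\,t} - \dfrac{1}{\mathrm{sn}\,T + \mathrm{sn}\,t}$, $f_2(t) = f_1(t + iK')$, (9)

the last relation being obtained by the use of (6).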

When the span of the aerofoil coincides with the distance between the foci, (8) can be
integrated in compact form. In this case $B = 2c$ and $\tau = t$. If we integrate $(f_1(t) \mp f_2(t))\cos\mu t$
round the rectangle whose corners are $\pm K \pm iK'$, indented to exclude the poles $\pm T \pm iK'$,
and observe that $f_1(t \pm iK') = f_2(t)$, $f_2(t \pm iK') = f_1(t)$, we find that the right-hand member of (8)
becomes

$-\left(\dfrac{\cosh\tfrac12\mu K'}{\sinh\tfrac12\mu K'}\right)^{\mp 1} = -\dfrac{b}{a}$ or $-\dfrac{a}{b}$.

This shows that v is constant across the span. Also $\pi BG = 2SVC_L$, where S is the plan area
of the aerofoil, and so if $C = \pi ab$, the area of the tunnel section, we get

$\epsilon = \dfrac{SC_L}{4C}\,\dfrac{b}{a + b}$ or $\epsilon = -\dfrac{SC_L}{4C}\,\dfrac{a}{a + b}$

according as the working section is closed or open, results obtained by Glauert (Glauert, 1932)
by a different method.
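For numerical work the two results may be wrapped in a small helper. This is a minimal sketch and not part of the paper: it assumes the corrections read $\epsilon = (SC_L/4C)\,b/(a+b)$ for the closed section and $\epsilon = -(SC_L/4C)\,a/(a+b)$ for the open section, with $C = \pi ab$ and the span equal to the focal distance; the function name `interference_angle` and the sample dimensions are hypothetical.

```python
import math


def interference_angle(S, C_L, a, b, closed=True):
    """Incidence correction for an aerofoil spanning the foci of an elliptic tunnel.

    S   : plan area of the aerofoil
    C_L : lift coefficient
    a, b: semi-axes of the tunnel section (a >= b > 0)
    """
    if b <= 0 or a < b:
        raise ValueError("expected a >= b > 0")
    C = math.pi * a * b  # area of the tunnel section
    shape = b / (a + b) if closed else -a / (a + b)
    return S * C_L / (4.0 * C) * shape


# Hypothetical tunnel and aerofoil dimensions, in consistent units.
S, C_L, a, b = 0.5, 0.8, 1.5, 1.0
print("closed section:", interference_angle(S, C_L, a, b, closed=True))
print("open section:  ", interference_angle(S, C_L, a, b, closed=False))
```

For a circular section ($a = b$) both cases reduce to a factor of one-eighth in magnitude, of opposite sign for the closed and open sections.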

Rosenhead (Rosenhead, 1933) has discussed the elliptic and rectangular section with the 
aid of theta functions. With the present method the rectangular section is treated by using 
the mapping 

Z = sc \pz 6 n \pz, 

K = K 
a h 2 


where a is the breadth and h is the height. 




Summary 

A general formula is obtained for the interference velocity when an aerofoil with elliptically 
distributed circulation is in a closed or open wind tunnel of any cross-section. The mapping 
of the section on the interior of a circle is given in terms of the Jacobian elliptic functions 
appropriate to the ellipse and rectangle. The result is worked out for an aerofoil which spans 
the focal distance in a tunnel whose section is an ellipse. 


REFERENCES TO LITERATURE 

Glauert, H., 1930. Elements of Aerofoil and Airscrew Theory, Cambridge, 190, 191.

-, 1932. “Wind Tunnel Interference on Aerofoils”, R. M., No. 1470, 11.

Milne-Thomson, L. M., 1938. Theoretical Hydrodynamics, London, 334.

-, 1940. “Hydrodynamical Images”, Proc. Camb. Phil. Soc., xxxvi, 246.

Neville, E. H., 1944. Jacobian Elliptic Functions, Oxford, 315.

Pistolesi, E., 1932. Aerodinamica, Turin, 318.

Rosenhead, L., 1933. “The Aerofoil in a Wind Tunnel of Elliptic Section”, Proc. Roy. Soc., A,
cxl, 579-604.

-, 1933. “Interference due to Walls of a Wind Tunnel”, Proc. Roy. Soc., A, cxlii, 308-320.


(Issued separately May 12, 1948) 





XXXIV.— Foundations of Relativity: Parts I and II. By A. G. Walker, D.Sc., 
Department of Mathematics, University of Sheffield. Communicated by 
Sir Edmund Whittaker, F.R.S. (With One Text-figure.) 

(MS. received June 8, 1945. Revised MS. received February 7, 1946. Read November 5, 1945)


Part I

1. Introduction 

Relativity is the study of matter in motion, and the basis of a theory of relativity can be 
either physical, mathematical, or logical. It is physical if some of the elementary objects 
and relations are concepts derived from the external world and if certain of their properties 
are assumed as physically obvious. If, however, the elementary objects, etc. are defined as 
mathematical symbols and relations, and if the subsequent theorems are mathematical deduc¬ 
tions from these definitions, then the theory may be described as mathematical. Lastly, the 
basis of a theory is logical if certain terms are undefined—and clearly stated as such—and if 
the theory is then developed strictly deductively from an explicit set of axioms and definitions. 
Analogous examples taken from geometry are the Euclidean, algebraic, and projective theories. 
The first, as developed by Euclid, has a physical basis, while the second is mathematical, a 
point being defined as an ordered set of numbers (co-ordinates) and a line as the class of points 
satisfying a linear equation. The third is logical, the undefined elements being point and line 
(an undefined class of points) and the axioms being those of incidence, extension, etc. Usually 
a physical theory comes first, to be followed by a mathematical and then by a logical theory, 
this last being so constructed that it includes previous theories when its undefined elements are 
replaced on the one hand by the conceptual physical objects and on the other hand by the 
symbolic mathematical objects. The construction of such a logical theory is not merely a 
matter of academic interest, for it can be regarded as an analysis of the previous theories. It 
tests, for example, the consistency and independence of their basic assumptions and definitions. 
It also indicates how a theory can be modified, with as little change as possible, so as to include 
some feature previously excluded. This can be particularly useful in the case of a physical 
theory which has been constructed to correspond as closely as possible to the external world, 
for such a theory may need continual modification to keep in step with observational data. 
For this reason the axioms of a logical theory should be not only consistent and independent 
but also simple, i.e. indivisible. 

When we examine existing theories of relativity from this point of view, we see that the 
Newtonian theory is physical and that General Relativity is mathematical. Kinematical 
Relativity, although more recent than General Relativity, is again physical, for it starts with 
the physical concepts of temporal order, particle, light signal, and collinearity of particles 
and with certain assumed relations between them, particularly between light signals and 
temporal order. This physical basis is, however, much smaller than in previous theories, and 
the task of constructing a corresponding logical theory † is considerably simplified by this
fact and by the novel but powerful deductive arguments initiated by E. A. Milne and continued
by G. J. Whitrow. One contribution which this theory makes in line with modern ideas
is the demonstration of the fact that it is sufficient to consider only observable, or “ knowable ”, 
quantities, i.e. those given by definable experiments which may be either practical or ideal 
when applied to the external world. This idea can be used with advantage in a logical theory, 
as will be seen later. 

† A paper by G. C. McVittie (1942), entitled “Axiomatic Treatment of Kinematical Relativity”, is not, as the
name implies, the statement of a logical theory in our sense. An examination of the “ axioms ” shows it to be 
physico-mathematical. These axioms were subsequently criticised by E. A. Milne and by G. J. Whitrow. 




In the present series of papers a theory of cosmology is developed deductively from a logical 
basis consisting of four undefined elements and relations (instant, particle-class of instants,
temporal order, and signal correspondence) and a number of axioms. As in Kinematical
Relativity, the present theory is expressed entirely in terms of observables. An additional
feature, however, is the supposition that there is only one “observer”, and that all axioms,
definitions, and “experiments” must be expressed in terms of primitive observations made by
this observer. The axioms are classified as temporal (1), signal (4), and fundamental (4),
this last number being relevant only to Part II since it will be increased in a later part. Part I
is devoted to the study of temporal order, and the consequences of the temporal and signal 
axioms as applied to particles in general and to properly defined collinear sets of particles in 
particular. By the end of Part I, with the addition of an axiom which is discarded later as 
redundant, we are in possession of a body of theorems which corresponds exactly to the physical 
basis of Kinematical Relativity. It appears, therefore, that this basis, when analysed, requires 
six axioms to make it logically sound. 

The temporal and signal axioms are also consistent with General Relativity except, in 
certain cases, axiom S.4. The real trouble arises from axiom S.1, which is satisfied without
question in space-times with zero or negative spatial curvature, but tends to exclude the
multiple reception of a light signal which is a feature of space-time with positive curvature.
This difficulty can be overcome by allowing the axiom to refer only to the “first arrival”,
but then axiom S.4 would be false as it stands and would need drastic modification to make it
satisfactory in all cases. There is no doubt that a logical system which is consistent with all 
space-times of General Relativity is far more complicated than that given here, and for the 
present, therefore, we prefer to examine the more economical system. 

In Part II we postulate and examine certain fundamental particles , which correspond to 
the fundamental particles in Kinematical Relativity and to those particles in General Relativity 
which are at rest relative to the matter near them. We are now beginning to construct a 
framework, or sub-stratum to use Milne’s term, for a model universe, and certain features 
which we wish this model to possess are contained in the fundamental axioms. Since it is at 
first necessary to examine collinear sets of fundamental particles, Part II is restricted to these 
and contains only those fundamental axioms which are relevant to such sets. A later paper 
will give the axioms in full and will discuss the three-dimensional sub-stratum as a whole. 
Some of the results obtained in Part II correspond to those obtained by Milne and Whitrow 
for what they call “linear equivalences”, but in our derivation we avoid making several of 
their assumptions. We do not assume, for example, that all fundamental particles are 
equivalent in pairs, or that the functions which arise are differentiable. 

One point which may be of interest is that we meet, for what appears to be the first time, 
a criticism levelled against all previous theories. Previously it has been assumed without 
question that temporal order can be adequately described by the continuum of real numbers, 
i.e. that the ordered set of instants in the history of a particle is ordinally-linear. This is, of 
course, open to criticism in view of the fact that there are known to exist ordered sets of much 
more complicated types. In the present theory we do not make this assumption but start 
with a quite general type of order. It is then proved that temporal order is ordinally-linear 
as a consequence of the fundamental axioms. The non-assumption of ordinal-linearity makes 
it necessary to introduce a functional calculus which can be applied to any ordered set; this 
is done in § 8 of Part I, and the notation is used extensively in Part II. Some knowledge of 
ordered sets in general is assumed in these papers (see, for example, Hobson, 1921, and 
Sierpinski, 1928). 

I wish to acknowledge here the debt which I owe to Professor E. A. Milne, both directly 
and through his published work of the last twelve years, a debt far greater than is indicated 
by specific references to his work. 


2. Undefined Basis 

The fundamental undefined element is an instant , and certain undefined mutually exclusive 
classes of instants are particles. Particles are denoted by A, P, Q, T, etc., and instants of a 
particle are denoted by suffixes attached to the particle symbol. 




At first one particle, A, is preferred, this being the particle-observer. Between any two
instants of A there is an undefined relation, denoted by the symbol <, so that for distinct
instants $A_x$, $A_y$, either $A_x < A_y$ or $A_y < A_x$. When $A_x < A_y$ we say that $A_x$ is before $A_y$; when
$A_x < A_y$, then we write $A_y > A_x$ and say that $A_y$ is after $A_x$. The identity relation is $A_x = A_y$,
and the three relations =, <, > are mutually exclusive.

Another undefined relation is a correspondence between the instants of any two particles. 
This will be called a signal correspondence, and for particles P, Q will be written $P \wedge Q$,
equivalent to $Q \vee P$. It is not a symmetric relation, for in general $P \wedge Q$ and $P \vee Q$ are not
identical. The signal axioms (§ 4) relate signal correspondences to temporal order and give 
rise to the desired properties of “light signals”. 

One of the rules followed in this work is to construct all axioms so that they can be expressed 
in terms of A’s instants. We find later that every other particle is similar in all respects to A, 
so that A finally loses its preferential position. It will be seen that all axioms and definitions 
are primitive , i.e. involve only simple statements of identity or order in A’s instants. 

It may be argued that we are not in agreement with experience in taking our undefined element 
of time to be an instant, and that this element should be a duration, to be pictured as an interval. 
This is certainly true, and we hope later to replace the instants, temporal relations and the temporal 
axiom of the present paper by still more fundamental ideas in closer agreement with experience. 
These will give rise to instants as defined elements, and, except for signal correspondences which will 
then refer to durations, the remainder of the present paper will be valid. 

Added in proof. A satisfactory theory on these lines is given in Walker, 1947, instants being
defined in terms of a partially ordered set of durations. The set of instants arising in this way is found 
to be closed, and the definition of ideal instants in § 3 below is therefore unnecessary. 


3. Temporal Order 

Axiom T.1.—If $A_x < A_y$ and $A_y < A_z$, then $A_x < A_z$.

We say in this case that $A_y$ is between $A_x$, $A_z$, and also $A_z$, $A_x$.

By this axiom the instants of A form an ordered set. It is well known that there are many 
ordered sets other than those similar † to sets of real numbers in natural order. It will,
however, appear later, as a consequence of the F-axioms, that the set A is similar to such a 
number set, so that it will be possible to construct a numerical representation of the set. Such 
a representation we call a clock . The theorem which makes this possible will be proved in 
§ 13 of Part II, and will now be given as an axiom with the understanding that it will be 
discarded later as redundant. 


Axiom T. X. —The set A possesses a denumerable subset such that between any two 
instants of A there is a member of the subset. 

This axiom implies that the set A is dense-in-itself, and is the well-known necessary and 
sufficient condition for A to be similar to an everywhere-dense set of numbers in natural order. 

Even with axiom T . X we see that the set A is not necessarily closed, i.e. may not be such 
that every bounded progression taken from A possesses a limit in A. We shall not, however, 
adopt an axiom of closure, for if the set is not already closed, then it can be augmented in a 
definable way so that it becomes closed. 

Let $\Sigma$ be a bounded ascending progression of instants taken from A, and place in a class $\mathcal{L}$
each instant of A which is before some member of $\Sigma$, and in a class $\mathcal{R}$ each instant of A which
is after all members of $\Sigma$. Then $\mathcal{L}$ has no last instant, and $\mathcal{R}$ has a first instant when and
only when $\Sigma$ has a limit in A. When $\Sigma$ has no such limit, we define an ideal instant as the
pair of classes $(\mathcal{L}, \mathcal{R})$. Other ideal instants are defined in the same way by ascending and
descending progressions which do not possess limits in A, with the understanding that all
related progressions define the same instant. If A has no first instant or no last instant, then
other ideal instants are defined by (O, A) and (A, O), where O is the null class.
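
The construction of an ideal instant can be illustrated on a familiar ordered set. The sketch below is only an analogy under stated assumptions: the set A is replaced by the rational numbers in natural order, the progression is one particular sequence of rationals increasing towards the square root of 2, and the names `in_lower`, `in_upper`, `step` and `progression` are hypothetical helpers introduced here. The two predicates play the part of the pair of classes that defines the ideal instant.

```python
from fractions import Fraction


def in_lower(q):
    """Lower class: q is before some member of the progression (q < sqrt 2)."""
    return q < 0 or q * q < 2


def in_upper(q):
    """Upper class: q is after every member of the progression (q > sqrt 2)."""
    return q > 0 and q * q > 2


def step(q):
    """Send a positive rational to one strictly nearer sqrt 2 on the same side.

    Since step(q)**2 - 2 == 2 * (q**2 - 2) / (q + 2)**2, the sign of q**2 - 2
    is preserved while its size shrinks.
    """
    return (2 * q + 2) / (q + 2)


def progression(n):
    """First n terms of a bounded ascending progression with no rational limit."""
    q, terms = Fraction(1), []
    for _ in range(n):
        terms.append(q)
        q = step(q)
    return terms
```

Because `step` always yields a later member of the lower class and an earlier member of the upper class, the lower class has no last element and the upper class no first, which is the situation in which the pair of classes is taken as a new, ideal element.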


† The word “similar” applied to two ordered sets has the definite meaning that there is a one-one
correspondence between their elements which preserves the order. 




A set A' is now defined as consisting of all the ordinary instants of A together with all the
ideal instants arising out of A. This new set is ordered as follows:—

(i) Order in A is preserved.

(ii) The ideal instant $(\mathcal{L}, \mathcal{R})$ is before or after $A_x$ according as $A_x$ belongs to $\mathcal{R}$ or $\mathcal{L}$.

(iii) The ideal instant $(\mathcal{L}, \mathcal{R})$ is before the ideal instant $(\mathcal{L}', \mathcal{R}')$ if $\mathcal{R}$ and $\mathcal{L}'$ have an
instant in common.

It follows at once that axiom T.1 holds for all instants of A'.

We have now augmented A to form a well-defined ordered set A' which is closed, every
progression in A' having a limit in the set. Also A' has a first instant ($A'_F$) and a last instant
($A'_L$). The set which remains when these extreme instants are excluded will be denoted by $\bar{A}$.
If axiom T.X is assumed, then it follows that $\bar{A}$ is similar to an open interval of the arithmetic
continuum.

This augmentation is a sound logical process. It is also significant physically when our theory 
is compared with a physical theory, a particular event, such as a collision between particles, being
perfectly well described in time by a succession of events leading up to the main event. 


4. Signal Axioms 

Axiom S. 1.— The signal correspondence between the ordinary instants of any two particles 
is one-one. 

From this axiom it follows that order can be induced in the set of instants comprising any
particle P by means of the correspondence $A \wedge P$. Axiom T.1 is satisfied by these ordered
instants, and P can be augmented by the addition of ideal instants, sets P', $\bar{P}$ being defined as
for A', $\bar{A}$. Since P is similar to A, it is easily seen to follow that P', $\bar{P}$ are similar to A', $\bar{A}$
respectively. The implied correspondence between ideal instants is unique because corresponding
ideal instants are limits of corresponding progressions, and it follows that axiom S.1
now holds for ideal as well as ordinary instants.

If instants $P'_x$, $Q'_x$ correspond in $P' \wedge Q'$, we write $P'_x \wedge Q'_x$.

Axiom S.2.—If P, Q, R are any three particles, $P_x$ is any instant of P, and $P_y$ is given
by the signal chain $P_x \wedge Q_x \wedge R_x \vee P_y$, then $P_y \geq P_x$.

Axiom S.3.—P, Q are any two particles and $P_x$, $P_y$ any two instants of P such that
$P_x \leq P_y$. If $P_x \wedge Q_x$ and $P_y \wedge Q_y$, then $Q_x \leq Q_y$.

There is an axiom S.4, but this will be left until § 6. It should be noted that although 
for convenience we do not now refer back to A, yet it is always possible to do so. Thus axiom 
S.2 could be written 

$A_x \wedge P_x \wedge Q_x \wedge R_x \vee P_y \vee A_y \supset A_y \geq A_x$.

Strictly, there are other axioms covering the cases when two or more of the particles in the 
above axioms are identical. It is convenient, however, to cover these cases by allowing 
particles to be identical, with the convention that if Q, P are identical and $Q_x = P_x$, then
$Q_x \wedge P_x$ and $Q_x \vee P_x$. It follows from axiom S. 2 that if P a A Q^, then P# A P^, and that
if P3 V Qaj, then P^ V P a , from which we see, as we would expect, that Q and P are always 
interchangeable. 

From the definition of ideal instants and the extension of the signal correspondence to 
include them, it can easily be verified that axioms S. 2 and S. 3 hold for the augmented particle 
sets . 

It will be remembered that the instants of any P' are ordered in accordance with the
correspondence $A' \wedge P'$, and the extension of axiom S.3 to P', Q' now shows that every signal
correspondence preserves order. We could now say that A has lost its preferential position.
In the language of relativity, A is still the only “real” particle-observer, but to every other 
particle has been attached a “subordinate observer”, and the structure is unaltered if we take 
one of these observers to be “real” instead of A. 





5. Signal Theorems 

The extreme instants of the augmented particle sets are special, and we are more often 
concerned with interior instants, i.e. those of the sets $\bar{A}$, $\bar{P}$, $\bar{Q}$, etc. That these never correspond
to extreme instants is clear from the following theorem, which is an immediate consequence of 
the extended signal axioms. 

Theorem 5.1.—If $P'_F$, $P'_L$, $Q'_F$, $Q'_L$ are the first and last instants of P', Q' respectively,
then $P'_F \wedge Q'_F$, $Q'_F \wedge P'_F$, $P'_L \wedge Q'_L$, $Q'_L \wedge P'_L$.

From a continued application of axioms S.2 and S.3 we get†

Theorem 5.2.—P, Q, R, . . ., T are any particles, and $P_x$ any instant of P. If
$P_x \wedge Q_x \wedge R_x \ldots \wedge T_x$ and $P_x \wedge T_y$, then $T_y \leq T_x$.

Corollary.—If $P_x \wedge Q_x \wedge P_y$, then $P_x \leq P_y$ (cf. theorem 5.5).
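
One way in which the continued application of axioms S.2 and S.3 might run is sketched below for chains of two and of three links. This is a reading under stated assumptions rather than the author's own proof: $\wedge$ and $\vee$ stand for the signal correspondence and its reverse, the inequalities in S.2 and S.3 are read as non-strict, and the auxiliary instants $P_w$, $T_w$ are names introduced here for illustration.

```latex
\begin{align*}
&\textbf{Two links.}\ \text{Given } P_x \wedge Q_x \wedge R_x \text{ and } P_x \wedge R_y,
  \text{ let } P_w \text{ be the instant with } P_w \wedge R_x \text{ (axiom S.1).}\\
&\quad P_x \wedge Q_x \wedge R_x \vee P_w \;\supset\; P_w \geq P_x
  \qquad\text{(axiom S.2)}\\
&\quad P_x \leq P_w,\ P_x \wedge R_y,\ P_w \wedge R_x \;\supset\; R_y \leq R_x
  \qquad\text{(axiom S.3)}\\[6pt]
&\textbf{Three links.}\ \text{Given } P_x \wedge Q_x \wedge R_x \wedge T_x \text{ and } P_x \wedge T_y,
  \text{ let } T_w \text{ be the instant with } Q_x \wedge T_w.\\
&\quad Q_x \wedge R_x \wedge T_x,\ Q_x \wedge T_w \;\supset\; T_w \leq T_x
  \qquad\text{(two-link case for } Q, R, T\text{)}\\
&\quad P_x \wedge Q_x \wedge T_w,\ P_x \wedge T_y \;\supset\; T_y \leq T_w
  \qquad\text{(two-link case for } P, Q, T\text{)}\\
&\quad T_y \leq T_w \leq T_x \;\supset\; T_y \leq T_x
  \qquad\text{(axiom T.1)}
\end{align*}
```

Longer chains can be shortened one link at a time in the same manner.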

Definition.—Particles P, Q coincide at instants $P_x$, $Q_x$ if $P_x \wedge Q_x$ and $P_x \vee Q_x$. This
is written $P_x \times Q_x$.

From theorem 5.1, all particles coincide at their first instants and again at their last 
instants . 

To prove that coincidence is transitive, suppose that $P_x \times Q_x$ and $P_x \times R_x$, and that
$Q_x \wedge R_y$. Then from theorem 5.2,

$P_x \wedge Q_x \wedge R_y$ and $P_x \wedge R_x \supset R_x \leq R_y$,
and

$Q_x \wedge P_x \wedge R_x$ and $Q_x \wedge R_y \supset R_y \leq R_x$.

Hence $R_x = R_y$ and $Q_x \wedge R_x$. Similarly $R_x \wedge Q_x$, and we have:

Theorem 5.3.—If $P_x \times Q_x$ and $P_x \times R_x$, then $Q_x \times R_x$.

We can also prove the following:— 

Theorem 5.4.—If $P_x \times Q_x$ and $R_x \wedge P_x \wedge R_y$, then $R_x \wedge Q_x \wedge R_y$. For suppose that
$R_z \wedge Q_x \wedge R_w$. Then $R_x \wedge P_x \wedge Q_x \vee R_z$ and $R_z \wedge Q_x \wedge P_x \vee R_x$, and from the extended
axiom S.2, $R_x \leq R_z$ and $R_z \leq R_x$. Hence $R_z = R_x$, and similarly $R_w = R_y$.

From the definition of coincidence and the corollary to theorem 5.2, we have immediately,

Theorem 5.5.—If P does not coincide with Q at $P_x$ and if $P_x \wedge Q_x \wedge P_y$, then $P_x < P_y$.

We shall say that particles P, Q are distinct in an interval (of P or Q) if there is an instant 
in the interval at which P, Q do not coincide. It will be shown later that if P does not coincide
with Q at $P_x$, then this instant lies in an interval of P at each instant of which P does not
coincide with Q. 

It is convenient to define an order relation between certain pairs of instants not necessarily 
belonging to the same particle. 

Definition.—$P_x < Q_y$ if $P_x \wedge Q_x \leq Q_y$ provided, in the case of the equality, P is not
coincident with Q at $P_x$. An equivalent statement is $P_x \leq P_y \wedge Q_y$, with the same proviso.
If $P_x < Q_y$, then $Q_y > P_x$.

The following are easily deduced from previous theorems:— 

Theorem 5.6.—(i) If $P_x < Q_y$ and $Q_y < R_z$, then $P_x < R_z$.

(ii) If $P_x < Q_y$ and $Q_y \times R_z$, then $P_x < R_z$.

(iii) If $P_x \times Q_y$ and $Q_y < R_z$, then $P_x < R_z$.

These theorems also hold when two of the three particles are identical. 

6. Optical Lines 

Definition. —A number of instants are in optical line if they belong one to each of a 
number of particles and if every instant is related by a signal correspondence to every other 
instant. Thus if $P_x \wedge Q_x$, $Q_x \wedge R_x$, and $P_x \wedge R_x$, then $P_x$, $Q_x$, $R_x$ are in optical line.

This corresponds physically to the primitive observation of particles “in line of sight”. 

† Space-time diagrams, such as those described systematically in § 9, can be conveniently used to suggest
formal proofs of this and later theorems. 






From theorem 5.4 we see that if $P_x$ is in an optical line and if $P_x \times Q_x$, then $Q_x$ is in the
same optical line. From the definition and properties of general order given in § 5, it follows
that the instants of an optical line are ordered except for coincidences.

Axiom S.4.—If P does not coincide with Q at $P_x$, and if $P_x Q_x R_x$ and $P_x Q_x S_x$ are optical
lines, then one at least of $P_x R_x S_x$ and $Q_x R_x S_x$ is an optical line.

By considering all the various possible orders involved in this axiom, it can be deduced
that both $P_x R_x S_x$ and $Q_x R_x S_x$ are optical lines, and that $P_x$, $Q_x$, $R_x$, $S_x$ are in the same optical
line in some order. Extending to any number of instants we have:

Theorem 6.1. —All instants collinear optically with two non-coincident instants are in 
one optical line . This line is given in the same way by any two non-coincident instants 
of the set. 

7. Collinear Particles 

Having defined optical lines, we can now define a linear set of particles, the idea being that 
such particles are at all times “traversed” by optical lines in both “directions”. Also we wish
to admit the possibility that although the particles are ordered along each optical line, this
order can change from one such line to another, i.e. “in time”.

Definition. — A set of particles, which includes T, is linear if through each instant of T 
there are two distinct optical lines each of which contains an instant from every particle of the 
set. It is clear from previous theorems that T can be any particle of the set. 

From theorem 6.1 we have immediately: 

Theorem 7.1. —If particles P, Q are distinct in every interval of P (and Q), then all 
particles collinear with P, Q form a linear set which includes P and Q. The set is determined
in the same way by any two distinct particles belonging to it. 

It is clear that order in relation to particles of a linear set is one of “betweenness” at an
instant. The essential relation, resulting from order along optical lines, follows from the
fact that: 

Theorem 7.2. —If P, T, Q are collinear , and T x is between instants of P, Q on one optical 
line, then T x is between instants of P, Q on the other optical line through T x . In this case we 
say that T is between P, Q at the instant T x . 

Extending to any number of particles, we have: 

Theorem 7.3.—For each instant $T_x$ of a particle T belonging to a linear set, every particle
of the set can be placed in one of three classes $\mathcal{C}$, $\mathcal{R}$, $\mathcal{L}$, those in $\mathcal{C}$ being coincident with T at $T_x$
and the others being such that T at $T_x$ is between any member of $\mathcal{R}$ and any member of $\mathcal{L}$ but
not between any two members of $\mathcal{R}$ or of $\mathcal{L}$.

The classes $\mathcal{R}$ and $\mathcal{L}$ are the two sides of T at the instant $T_x$. We shall now state a theorem
which will be proved in § 8. 

Theorem 7.4. —If two particles are on the same side (opposite sides) of T at one instant 
and on opposite sides (same side) of T at a later instant, then one of them coincides with T at
some intermediate instant. 

From this theorem we see that in a linear set which includes T, two particles which are on 
the same side or on opposite sides of T at $T_x$ remain so while $T_x$ advances, until one of the
particles coincides with T. Each side of T thus preserves its identity throughout an interval 
of T which contains no instant at which all the particles of the set coincide. In each such 
interval the two sides of T can conveniently be described as right and left respectively, the 
choice being arbitrary at one instant. 

Through each $T_x$ there are two optical lines traversing the set. We shall describe as the
right optical line that one which contains instants such as $P_x$, where $T_x \wedge P_x$ if P is to the right
of T at $T_x$ and $T_x \vee P_x$ if P is to the left of T at $T_x$. The other is the left optical line.

Having defined the two sides of T, we can now define the right and left sides of any other
particle of the linear set. One way of doing this is to say that Q is to the right of P at $P_x$
(and $Q_x$) if $P_x$, $Q_x$ are on the same right optical line and $P_x \wedge Q_x$. Similarly for particles to
the left of P.

From theorem 7.4 it follows that if no two particles of a linear set coincide, then the set 
is permanently ordered, the two sides of each particle remaining unchanged at all instants. 





8. Record Functions 

Suppose that particles T, P are linked by a chain of signal correspondences, and that
instants $T_x$, $P_x$ correspond by this chain. Then it is clear from the signal axioms that $P_x$
advances continuously (in the general sense applicable to ordered sets) as $T_x$ advances over
the set T. Chains of particular interest are those which start and finish at the same particle,
say T. If such a chain starts at $T_x$ and finishes at $T_y$, then we can conveniently use functional
notation and write $T_y = f(T_x)$. This f is a symbol denoting a certain kind of correspondence
between the instants of T, and will be called a record function. As mentioned above, all
record functions are continuous and monotonic increasing. Examples of chains defining
record functions are $T \wedge P \wedge T$, $T \vee P \vee T$, $T \wedge P \wedge Q \vee T$, $T \wedge P \vee Q \wedge T$.

If f, g are record functions and $T_y = f(T_x)$, $T_z = g(T_y)$, then we write $T_z = gf(T_x)$, and call
gf the functional product, or simply product, of f and g. Similarly for any number of functions.
Also, if $T_y = f(T_x)$, then we write $T_x = f^{-1}(T_y)$ and call $f^{-1}$ the inverse of f. It follows at once
from our definitions that all products and inverses of record functions are themselves record
functions.

Functional powers $f^n$, $f^{-n}$ of f are defined in obvious ways and satisfy

$f^p f^q = f^{p+q}$, $(f^p)^q = f^{pq}$

for all positive and negative integers p, q, $f^0$ being the identity function. This identity function
will be written I, so that $I(T_x) = T_x$.
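
The algebra of products, inverses and powers can be tried out on a simple family of increasing maps. The sketch below is a toy under stated assumptions and not part of the theory: instants are taken to be rational numbers, and a record function is assumed to have the form t -> m*t + c with m > 0. The class name `Record` and its methods are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction


class Record:
    """Toy record function t -> m*t + c with m > 0 (monotonic increasing)."""

    def __init__(self, m, c):
        self.m, self.c = Fraction(m), Fraction(c)

    def __call__(self, t):
        return self.m * t + self.c

    def __matmul__(self, other):
        # Functional product (self other): apply `other` first, then `self`.
        return Record(self.m * other.m, self.m * other.c + self.c)

    def inverse(self):
        # Solve s = m*t + c for t.
        return Record(1 / self.m, -self.c / self.m)

    def power(self, n):
        # n-fold product for n > 0, identity for n = 0, inverse powers for n < 0.
        base = self if n >= 0 else self.inverse()
        out = Record(1, 0)
        for _ in range(abs(n)):
            out = base @ out
        return out

    def __eq__(self, other):
        return (self.m, self.c) == (other.m, other.c)
```

With `f = Record(3, 4)`, the comparisons `f.power(2) @ f.power(-5) == f.power(-3)` and `f.power(2).power(3) == f.power(6)` check the two index laws, and `f.power(0)` acts as the identity function I.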

Referring to T as in the above definitions, the simplest record function associated with 
another particle P is given by T A P A T; this function will be denoted by P. 

For any instant $T_0$ at which T does not coincide with P, an ascending progression $T_n$ and
a descending progression $T_{-n}$ are defined by $T_p = P^p(T_0)$ for all integers p. If $T_\omega$, $T_{-\omega}$ are
their respective limits (possibly $T'_L$ and $T'_F$), we can prove

Theorem 8.1.—T coincides with P at $T_{-\omega}$ and again at $T_\omega$, but at no intermediate instant.

For if $T_0 < T_x \leq T_1$, then since P is an increasing function we have $P(T_x) > P(T_0) = T_1$.
This shows that if $T_x \wedge P_x$, then $P_x \wedge T_x$ is impossible, and T cannot coincide with P in the
interval $T_0 T_1$, or at $T_1$. Similarly for each interval $T_p T_{p+1}$, so that finally T cannot coincide
with P at any instant between $T_{-\omega}$ and $T_\omega$. Now if $T_n \wedge P_n$, then $P_n \wedge T_{n+1}$, and in the limit
we have $T_\omega \times P_\omega$, where $P_\omega$ is the limit of $P_n$. Similarly $T_{-\omega} \times P_{-\omega}$, where $P_{-\omega}$ is the limit
of $P_{-n} \vee T_{-n}$. The theorem is thus proved. It shows that any instant at which T does not
coincide with P lies within an interval of T at every instant of which T does not coincide with P.
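
The ascending progression of theorem 8.1 can be followed numerically in a toy model. The assumptions below go beyond the present theory and are made only for illustration: a rational-valued clock on T, signals of unit speed, T at rest at the origin, and P moving uniformly at x = a + v*t to the right of T. The helper names `echo` and `ascending` are hypothetical. Under these assumptions the out-and-back signal chain from T to P and back to T gives the record function t -> k*t + 2a/(1 - v) with k = (1 + v)/(1 - v).

```python
from fractions import Fraction


def echo(a, v):
    """Record function of the out-and-back chain for a toy particle at x = a + v*t."""
    a, v = Fraction(a), Fraction(v)
    k = (1 + v) / (1 - v)
    return lambda t: k * t + 2 * a / (1 - v)


def ascending(f, t0, n):
    """Return the instants t0, f(t0), f(f(t0)), ... up to the n-th iterate."""
    terms = [Fraction(t0)]
    for _ in range(n):
        terms.append(f(terms[-1]))
    return terms


# P starts one unit to the right of T and approaches at half the signal speed,
# so in this model the two particles meet at t = 2.
P = echo(1, Fraction(-1, 2))
T = ascending(P, 0, 30)
```

The iterates increase strictly, stay before t = 2 and approach it, and `P(Fraction(2))` returns 2, so in this model the limit of the progression is the instant of coincidence and no coincidence occurs earlier.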

We are now in a position to prove theorem 7.4. Consider collinear particles T, P, Q, and 
referring to T, let f denote the function such that $f(T_x)$ is the earlier of $P(T_x)$, $Q(T_x)$, and so
for each $T_x$. Then it is clear that f is continuous and monotonic increasing, and that if
$f(T_x) = T_x$ then T coincides with either P or Q or both at $T_x$.

Lemma.—If P, Q are on the same side of T at $T_0$, then they are on the same side of T
at every instant between and including $T_0$, $f(T_0)$.

To prove this lemma, we can assume without loss of generality that P is between T, Q,
or coincides with Q, at $P_0$ where $T_0 \wedge P_0$, and define other instants by $P_0 \wedge Q_0$ ($T_0 \wedge Q_0$),
$Q_1 \wedge P_0 \wedge T_1$ ($Q_1 \wedge T_1$), and $Q_0 \wedge T_2$. Then from theorem 5.2,

$P_0 \wedge Q_0 \wedge T_2$ and $P_0 \wedge T_1 \supset T_1 \leq T_2$,

so that $f(T_0) = T_1$, the earlier of $T_1$, $T_2$. Let $T_x$ be any instant between $T_0$, $T_1$ or at $T_1$, and
suppose that T is between P, Q at $T_x$. Writing $T_x \wedge P_x$, $T_x \vee Q_x$ ($Q_x \wedge P_x$), then from
axiom S.3,

$T_0 \wedge P_0$, $T_x \wedge P_x$ and $T_0 < T_x \supset P_0 < P_x$,

$Q_1 \wedge P_0$, $Q_x \wedge P_x$ and $P_0 < P_x \supset Q_1 < Q_x$,

and

$Q_x \wedge T_x$, $Q_1 \wedge T_1$ and $T_x \leq T_1 \supset Q_x \leq Q_1$,

so that our supposition leads to a contradiction. The lemma is thus proved.

Returning now to the theorem, suppose that P, Q are on the same side of T at $T_0$, and
write $T_n = f^n(T_0)$. Applying the lemma to successive intervals $T_n T_{n+1}$, it follows that P, Q




are on the same side of T at every instant between $T_0$ and $T_\omega$, where $T_\omega$ is the limit of $T_n$.
Since f is continuous, $f(T_\omega) = T_\omega$, so that T coincides with P or Q at $T_\omega$. If, therefore, P, Q
are on opposite sides of T at some instant after $T_0$, then this instant is also after $T_\omega$, and the
theorem is proved. There is a similar argument for the alternative form of the theorem.

In the case of a linear set of particles which includes T, a modified record function, associated
with P and denoted by $P^*$, is defined by $P^*(T_x) = P(T_x)$, $T_x$, or $P^{-1}(T_x)$ according as P is to
the right of, coincides with, or is to the left of T at $T_x$. In many ways this function is more
important than direct record functions, for $P^*$ not only describes, in a sense, the “apartness”
of T and P but it also indicates their relative position, since $P^*(T_x) \gtrless T_x$ according as P is to
the right or left of T.

Theorem 8.2.—In a linear set of which T is a fixed and P a variable member, let $T_0$ be
any instant of T and let $P_0$ be the instant of P which belongs to the right optical line through
$T_0$. Then the order of the instants $P_0$ along this optical line is similar to the order of the
instants $P^*(T_0)$ in T.

To prove this, it is clearly true for particles P, Q on opposite sides of T at $T_0$. For
particles to the right of T at $T_0$, suppose that P is between T and Q on the right optical
line through $T_0$. Then the relation † $T_1 \leq T_2$ in the proof of the above lemma is now equivalent
to $P^*(T_0) < Q^*(T_0)$, as required. Similarly for particles to the left of T. Finally, it is clear
that if P, Q coincide on the right optical line through $T_0$, then $P^*(T_0) = Q^*(T_0)$. This
completes the proof.

We deduce immediately: 

Theorem 8.3.—If T, P, Q are collinear and $P^*(T_0) = Q^*(T_0)$, then P coincides with Q
at the instant of P which is on the right optical line through $T_0$.

The idea of denseness can be applied to a linear set of particles, either in relation to itself 
or in relation to temporal order by means of theorem 8.2. The latter is the more significant, 
and we state the following definition:— 

Definition.—A linear set of particles which includes T is dense at $T_0$ if for any instant
$T_x > T_0$ there is a particle P of the set such that $T_0 < P(T_0) < T_x$.

9. Conclusion of Part I 

We have so far considered the general consequences of certain concepts and of the five 
temporal and signal axioms. No geometrical (space) axioms have been introduced, and 
spatial relationships have been expressed in terms of temporal order and signal correspondences. 
In particular we have defined and discussed linear sets of particles, and there has emerged 
the idea of spatial order in such a set, an order which can vary “in time”. One important 
result is that the two “sides” of a particle T are well defined at each instant of T and persist 
in time until all the particles of the linear set coincide; also, if another particle crosses from 
one side of T to the other, then it must coincide with T at some instant. From this and other 
properties we derive the picture, familiar in classical kinematics, of particles moving in the
same spatial line. The only differences are that our time and space are not necessarily ordinally-linear
(i.e. similar to the arithmetic continuum), axiom T.X being necessary for this, and that
no distance measure has yet been defined for the “interval” between two particles. These 
differences will be removed later. 

From our results it follows that we can construct a convenient space-time diagram for a 
linear set of particles. Referring to rectangular axes in a plane, then the instants of a particle 
are represented by the points of a continuous smooth curve whose tangent always makes 
an angle of less than 45° with the y-axis, temporal advance being given by y increasing. A
coincidence of two particles is denoted by an intersection of the corresponding curves. Right
and left optical lines are represented by straight lines parallel to y = x and y = −x respectively.
Strictly, this diagrammatic representation implies axiom T.X, but with care it can be used
without assuming this axiom. Examples of the use of this diagram occur in Part II. 

† The equality $T_1 = T_2$ is now impossible because $P_0$, $Q_0$, $T_1$ cannot be in optical line.

If axiom T.X is assumed, then, as stated in the Introduction, we have reached precisely
the starting-point for previous relativity theories, particularly Kinematical Relativity. The
physical properties of temporal order, light signals, collinearity, etc., which were previously
assumed as physically obvious have now been deduced as theorems, and it follows that these
assumptions are logically equivalent to six axioms. This number does not, of course, include
the explicit principles and hypotheses, such as the Cosmological Principle, upon which these
other theories are based. In our theory, the axioms which correspond to such principles,
etc. are the fundamental axioms, some of which are given and studied in Part II. It will
there be shown that axiom T.X is a consequence of the fundamental axioms.


Part II 

10. Special Relationships 

Before stating the fundamental axioms, we shall define and discuss certain special relationships
between particles which are needed for these axioms. The first is that of symmetry.

Definition. Particles P, Q are symmetric about a collinear particle T if they are on
opposite sides of (or coincident with) T and satisfy $P(T_x) = Q(T_x)$ for all $T_x$. An equivalent
statement is $P^*Q^*(T_x) = T_x$.

Theorem 10.1. If particles T, P, Q, R are collinear and if P, Q are symmetric about R,
then $Q^*(T_x) = R^*P^{*-1}R^*(T_x)$ for all $T_x$.

Lemma. Particles T, P, Q are collinear, $P_x$ is any instant and $P_y = Q^*(P_x)$. If $P_x$, $T_x$
lie on a right optical line and if $P_y$, $T_y$ also lie on a right optical line, then $T_y = P^{*-1}Q^*(T_x)$.
This function can be regarded as the projection on to the set T of the function $Q^*(P_x)$.

Let $P_y$, $T_z$ lie on a left optical line. Then we find in all cases, $T_z = P^*(T_y)$ and $T_z = Q^*(T_x)$,
whence $T_y = P^{*-1}(T_z) = P^{*-1}Q^*(T_x)$.

Turning to the theorem, then P, Q are symmetric about R if $P^*Q^*(R_x) = R_x$ for all $R_x$. If,
therefore, $T_x$, $T_y$, $T_z$ are defined to lie on the right optical lines through $R_x$, $Q^*(R_x)$, and
$P^*Q^*(R_x)$ respectively, then we are given that $T_z = T_x$. Now by the lemma we have

$$T_y = R^{*-1}Q^*(T_x), \qquad T_z = R^{*-1}P^*(T_y),$$

whence $T_z = R^{*-1}P^*R^{*-1}Q^*(T_x) = T_x$ for symmetry, and the theorem follows.
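
The last step can be spelled out as a short piece of operator algebra. The block below is only a sketch: it treats the modified record functions as one-one maps of the instants of T and writes I for the identity map, a symbol which is not used in the text and is introduced here for convenience.

```latex
% Sketch: solving the symmetry condition for Q^*.
% Assumption: R^*, P^*, Q^* are invertible maps of the instants of T; I is the identity map.
\begin{align*}
R^{*-1} P^{*} R^{*-1} Q^{*} &= I
  && \text{(symmetry, projected on to } T\text{)} \\
P^{*} R^{*-1} Q^{*} &= R^{*}
  && \text{(pre-multiply by } R^{*}\text{)} \\
R^{*-1} Q^{*} &= P^{*-1} R^{*}
  && \text{(pre-multiply by } P^{*-1}\text{)} \\
Q^{*} &= R^{*} P^{*-1} R^{*}
  && \text{(pre-multiply by } R^{*}\text{)}
\end{align*}
```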

If P, Q are symmetric about a collinear particle T, then we can say that each of P, Q is 
the reflection of the other in T. 

Definition. —A linear set of particles is completely symmetric if it contains the reflection 
of every member in every other member. 

The following theorem is seen at once by induction, using the last theorem and the definition 
of modified record functions. 

Theorem 10.2. If T, P are members of a completely symmetric linear set of particles,
then for each integer p there is a member whose modified record function is $P^{*p}(T_x)$, the
corresponding unmodified record function being $P^n(T_x)$ where $n = |p|$.

The next special relationship is that of commutation: 

Definition. If T, P, Q are collinear particles, then P, Q are commutative with respect
to T if $P^*Q^*(T_x) = Q^*P^*(T_x)$.

This relationship is symmetric in P and Q, and we shall prove that it is symmetric in all 
three particles, i.e. that: 

Theorem 10.3. —If P, Q are commutative with respect to a collinear particle R, then P, R 
are commutative with respect to Q. 

To prove this, let T be a particle collinear with P, Q, R (it can be identical with one of these).
Then by the lemma of theorem 10.1, the projection of the function $P^*Q^*(R_x)$ on to the set T
is $R^{*-1}P^*(T_y)$ where $T_y = R^{*-1}Q^*(T_x)$, i.e. is $R^{*-1}P^*R^{*-1}Q^*(T_x)$, and the projection of
$Q^*P^*(R_x)$ is $R^{*-1}Q^*R^{*-1}P^*(T_x)$. Equating these and pre-multiplying by $R^*$, then P, Q are
commutative with respect to R if

$$P^*R^{*-1}Q^*(T_x) = Q^*R^{*-1}P^*(T_x). \tag{1}$$

Pre- and post-multiplying by $P^{*-1}$ and inverting both sides, we get the same equation but with
Q and R interchanged. The theorem is thus proved. It also follows that Q, R are commutative
with respect to P.
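
The manipulation of (1) is quick to verify. In the sketch below the argument $T_x$ is suppressed and products are read as compositions of maps; nothing beyond invertibility is assumed.

```latex
% Sketch: equation (1) with Q and R interchanged.
\begin{align*}
P^{*} R^{*-1} Q^{*} &= Q^{*} R^{*-1} P^{*} && \text{(1)} \\
P^{*-1}\bigl(P^{*} R^{*-1} Q^{*}\bigr)P^{*-1} &= P^{*-1}\bigl(Q^{*} R^{*-1} P^{*}\bigr)P^{*-1}
  && \text{(pre- and post-multiply by } P^{*-1}\text{)} \\
R^{*-1} Q^{*} P^{*-1} &= P^{*-1} Q^{*} R^{*-1} \\
P^{*} Q^{*-1} R^{*} &= R^{*} Q^{*-1} P^{*}
  && \text{(invert both sides)}
\end{align*}
% The last line is (1) with Q and R interchanged, i.e. the condition that
% P, R are commutative with respect to Q.
```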

Definition. —A linear set of particles is a commutative set if every sub-set of three particles 
satisfies the commutative relation. 

Theorem 10.4. A linear set of particles, one of which is T, is commutative if T, P, Q
are commutative for every pair P, Q of the set.

For referring to T, we have for any three particles P, Q, R, $P^*Q^* = Q^*P^*$, $P^*R^* = R^*P^*$,
and $Q^*R^* = R^*Q^*$. From these relations we can deduce (1), and the theorem is proved.
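
One way to carry out the deduction is sketched below. It uses the elementary fact, assumed here without proof, that if two invertible maps commute then each commutes with the inverse of the other.

```latex
% Sketch: deriving (1) from pairwise commutation with respect to T.
% Assumption: P^*R^* = R^*P^* implies P^*R^{*-1} = R^{*-1}P^*, and similarly for Q^*.
\begin{align*}
P^{*} R^{*-1} Q^{*}
  &= R^{*-1} P^{*} Q^{*} && (P^{*} \text{ commutes with } R^{*-1}) \\
  &= R^{*-1} Q^{*} P^{*} && (P^{*} Q^{*} = Q^{*} P^{*}) \\
  &= Q^{*} R^{*-1} P^{*} && (Q^{*} \text{ commutes with } R^{*-1})
\end{align*}
```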

Lastly there is the fundamental relation,† which is defined for any three particles but which
reduces to commutation in the case of collinear particles. This relation for P, Q, R will be 
written P/QR and is symmetric in Q and R. 

Definition. For any particles P, Q, R and any instant $P_x$, let other instants of P, Q, R
be defined by the signal chains

$$P_x \wedge Q_x \wedge R_x \wedge Q_y \vee P_y, \qquad P_x \wedge R_y \wedge Q_z \wedge R_z \vee P_z.$$

Then P, Q, R satisfy the relation P/QR if $P_y = P_z$ for every $P_x$.
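
As an illustration only, the relation can be checked in a simple model which is not part of the axiomatic development: three particles at rest at co-ordinates 0, q, r on a line, with signals of unit speed. The first three links of each chain are read as successive signals, and the last link as a signal which leaves P and arrives at Q (or R) at the instant named. The co-ordinates and the readings $t(\cdot)$ are assumptions introduced for the example.

```latex
% Illustrative model (an assumption, not the general case):
% P at 0, Q at q, R at r, all at rest; signals travel with unit speed;
% t(\cdot) is the reading of an instant, with t(P_x) = t.
\begin{align*}
\text{first chain:}\quad
  t(Q_x) &= t + |q|, &
  t(R_x) &= t + |q| + |r-q|, \\
  t(Q_y) &= t + |q| + 2|r-q|, &
  t(P_y) &= t(Q_y) - |q| = t + 2|r-q|; \\
\text{second chain:}\quad
  t(R_y) &= t + |r|, &
  t(Q_z) &= t + |r| + |r-q|, \\
  t(R_z) &= t + |r| + 2|r-q|, &
  t(P_z) &= t(R_z) - |r| = t + 2|r-q|.
\end{align*}
% Hence P_y = P_z for every P_x, so P/QR holds in this model, and the common
% interval 2|r-q| is twice the separation of Q and R.
```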

Theorem 10.5. If P, Q, R are collinear particles satisfying the relations P/QR, Q/PR
and R/PQ, then P, Q, R are commutative. Conversely, if P, Q, R are collinear and commutative,
then the three fundamental relations are satisfied.




Fig. 1 

Note. Particle lines are drawn (broken) in (b) but not in (a) since it is not definite whether they
do or do not intersect in this case. The positions of $P_u$ and $P_v$ are also not definite, all that is known
being that P is between Q and R at these instants.

† This relation appears at first sight to be artificial, but it will be seen later to be equivalent to the statement
that T’s estimates of the “distances” QR and RQ are the same. For collinear particles, this relation is contained
in Milne’s assumption of equivalence.

To prove this we must deduce $Q^*R^*(P_t) = R^*Q^*(P_t)$ for every $P_t$. Suppose firstly that
$P_x$ is an instant at which P is between Q, R. Then except for an interchange of Q and R,
fig. 1 (a) is the only possible diagram consistent with collinearity and the relation P/QR.
Fig. 1 (b), for example, is an attempt to construct an alternative diagram assuming that R is
between P, Q at $R_x$, but it shows that $R_z \vee P_y$ is impossible; other alternatives are similarly
ruled out. Constructing now $P_u$, $P_v$, $Q_v$ and $R_w$ as in fig. 1 (a), we deduce at once from this
diagram

(ii) Q*- P*R*(Q y )-R*P*(Q^, 

(iii) R 2 =P*Q*(RJ = Q*P*(R„). 

If P is between Q, R at $P_t$, then the diagram applies with $P_t$ in place of $P_u$ (and possibly Q,
R interchanged), and (i) gives the desired relation. If P is to the right of both Q and R at $P_t$,
then the diagram applies with P and Q interchanged (and then possibly also Q and R), and
with $P_t$ in place of $Q_v$; the desired relation is now given by (ii). If P is to the left of both Q
and R at $P_t$, then the diagram applies with P and R interchanged (and then possibly also
Q and R), and with $P_t$ in place of $R_w$; the desired relation is now given by (iii). Finally,
the cases with coincidences follow from the above by continuity, and the theorem is proved. 
The converse of the theorem is at once seen to be true from the diagrams illustrating 
commutation. 


11. The Fundamental Axioms 

We are now in a position to state the fundamental axioms as they concern linear sets of
particles. Unlike previous axioms, these do not refer to all particles but only to a special set
$\mathcal{F}$ of fundamental particles. The set contains linear sub-sets, and we shall only be concerned
with these in the present paper. We shall therefore consider only those axioms or
their special cases which refer to linear sets. The axioms for the whole set $\mathcal{F}$ will be given
in a later paper.

Axiom F.1. The fundamental relation is satisfied by every sub-set of three particles
in $\mathcal{F}$.

Axiom F.2. If P, Q are any two distinct particles in $\mathcal{F}$, then the set of particles
in $\mathcal{F}$ which are collinear with P, Q is dense at every instant of some member of $\mathcal{F}$.

Axiom F.3'. Every linear sub-set of $\mathcal{F}$ is completely symmetric.

Axiom F.4. There are at least two distinct particles in $\mathcal{F}$.

In a later paper, when we add other axioms which give three spatial dimensions, axiom
F.3' will be replaced by an axiom F.3 of which it is a special case and which gives what we
shall call complete symmetry in $\mathcal{F}$.

From axiom F.4 we see that there is at least one linear sub-set of $\mathcal{F}$. Such a sub-set we
shall write as $\mathcal{L}$, and we shall now be concerned only with $\mathcal{L}$, a typical linear sub-set
of $\mathcal{F}$. From axiom F.1 and theorem 10.5, we have that $\mathcal{L}$ is a commutative set. From
axiom F.2, $\mathcal{L}$ contains at least one particle, which we shall write as T, such that $\mathcal{L}$ is
dense at every instant of T. Later we shall prove that this property holds for every
particle of $\mathcal{L}$.

Referring to T, let $P^*$, $Q^*$, etc. denote the modified signal functions of particles of $\mathcal{L}$.
Then from theorem 10.4, since $\mathcal{L}$ is a commutative set, every one of these functions
commutes with every other. It follows that every functional product of any number of these
functions commutes with every other functional product.

From axiom F.3', $\mathcal{L}$ is a completely symmetric set, so that if P, R are any two members,
then there is another member whose modified record function is $R^*P^{*-1}R^*$, this being the
reflection of P in R by theorem 10.1. From the commutative property, this function is
equivalent to $P^{*-1}R^{*2}$.

12. Permanent Order in $\mathcal{L}$

Theorem 12.1. No particle of $\mathcal{L}$ distinct from T (the member of $\mathcal{L}$ mentioned in
axiom F.2) coincides with T at any instant. (The first and last instants of the set T' are
excluded by our notation.)

Suppose that P is a member of $\mathcal{L}$ which is distinct from T but coincides with T at one or
more instants. Then there is an interval, say $T_0T_1$ ($T_1 > T_0$), such that T, P coincide at either
or both of $T_0$, $T_1$ but at no instant between. Suppose that T, P coincide at $T_0$. Then from
axioms F.2 and F.3' there is a member of $\mathcal{L}$, say Q, which is on the right of T at $T_0$ and
is such that $T_0 < Q(T_0) < T_1$. Now from the commutative property of $\mathcal{L}$ we have

$$P^*Q^*(T_0) = Q^*P^*(T_0) = Q^*(T_0),$$

since $P^*(T_0) = T_0$ is a consequence of coincidence. Hence T, P coincide at the instant $Q^*(T_0)$,
which is $Q(T_0)$ and lies between $T_0$ and $T_1$. We therefore have a contradiction. There is a
similar argument supposing that T, P coincide at $T_1$, and the theorem is therefore proved.

Corollary 1. If P is a member of $\mathcal{L}$, then for all instants of T, either $P(T_x) = P^*(T_x)$
or $P(T_x) = P^{*-1}(T_x)$, and the record functions P, Q, . . . of particles of $\mathcal{L}$ are commutative
in pairs.

This follows from the definition of modified record functions and from the properties of
commutative functions.

Corollary 2. T has no first or last instant.

For from elementary signal properties, T would coincide with every other particle at such 
an instant (cf. theorem 5.x). 

We can now prove the following important theorem:— 

Theorem 12.2. No two particles of $\mathcal{L}$ coincide except at their first and last instants.

Lemma. Particles P, Q of $\mathcal{L}$ are identical if $P^{*2} = Q^{*2}$.

For suppose that $P^*(T_x) < Q^*(T_x)$ for some $T_x$. Then from the commutative property
and the fact that $P^*$, $Q^*$ are strictly increasing functions, we have

$$P^{*2}(T_x) < P^*Q^*(T_x) = Q^*P^*(T_x) < Q^{*2}(T_x),$$

which contradicts the given equality. Hence $P^*(T_x) = Q^*(T_x)$ for all $T_x$ and the lemma is
proved by theorem 8.3.
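
The chain of inequalities in the lemma uses only monotonicity and commutation, as the following sketch makes explicit.

```latex
% Sketch of the lemma: suppose P^*(T_x) < Q^*(T_x) for some instant T_x.
\begin{align*}
P^{*2}(T_x) = P^{*}\bigl(P^{*}(T_x)\bigr)
  &< P^{*}\bigl(Q^{*}(T_x)\bigr) && (P^{*} \text{ strictly increasing}) \\
  &= Q^{*}\bigl(P^{*}(T_x)\bigr) && (\text{commutation}) \\
  &< Q^{*}\bigl(Q^{*}(T_x)\bigr) = Q^{*2}(T_x) && (Q^{*} \text{ strictly increasing})
\end{align*}
% So P^{*2}(T_x) < Q^{*2}(T_x), contradicting P^{*2} = Q^{*2}.
% The case Q^*(T_x) < P^*(T_x) is excluded in the same way with P and Q interchanged.
```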

Suppose now that P, Q are distinct particles of $\mathcal{L}$ which coincide, so that by theorem 8.3
there is a $T_0$ for which $P^*(T_0) = Q^*(T_0)$. By axiom F.3' let R be the reflection of T in P and
let S be the reflection of R in Q. Then from theorem 10.1 and the commutation property we
have $R^* = P^{*2}$ and

$$S^* = Q^*R^{*-1}Q^* = Q^*P^{*-2}Q^* = P^{*-2}Q^{*2}. \tag{2}$$

From $P^*(T_0) = Q^*(T_0)$ it follows that $Q^*P^{*-2}Q^*(T_0) = T_0$, so that S coincides with T at $T_0$.
Hence from theorem 12.1, S must be identical with T, whence from (2), $P^{*2} = Q^{*2}$. This,
however, is impossible by our lemma, since P, Q are given distinct. Our supposition is
therefore false, and the theorem is proved.
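
The step from $P^*(T_0) = Q^*(T_0)$ to the coincidence of S with T can be filled in as follows. The auxiliary map U is introduced only for this sketch and does not appear in the text.

```latex
% Sketch: why Q^* P^{*-2} Q^* fixes T_0.
% Assumption: U := P^{*-1} Q^* is an auxiliary map used only here.
\begin{align*}
U(T_0) &= P^{*-1}\bigl(Q^{*}(T_0)\bigr) = P^{*-1}\bigl(P^{*}(T_0)\bigr) = T_0, \\
Q^{*} P^{*-2} Q^{*} &= P^{*-1} Q^{*} P^{*-1} Q^{*} = U^{2}
  && (\text{commutation}), \\
Q^{*} P^{*-2} Q^{*}(T_0) &= U\bigl(U(T_0)\bigr) = T_0 .
\end{align*}
```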

We have now proved that $\mathcal{L}$ is a permanently ordered linear set.

13. The Ordinal-Linearity of Time 

We are now in a position to deduce from the fundamental axioms the following
theorem:

Theorem 13.1. The set T of instants is ordinally-linear, i.e. axiom T.X is now true.

Let† $T_n$ be a descending progression of instants with a limit, say $T_\omega$, in T. Then by axiom
F.2 we can choose particles $\underset{n}{P}$ in $\mathcal{L}$ such that

$$\underset{n}{P}(T_\omega) < T_n, \qquad \underset{n+1}{P}(T_\omega) < \underset{n}{P}(T_\omega) \qquad (n = 1, 2, \ldots). \tag{3}$$

Let $\Sigma$ be the set of instants

$$\underset{n}{P}^{\,p}(T_\omega) \qquad (n = 1, 2, \ldots;\ p = 0, \pm 1, \pm 2, \ldots),$$

repetitions being excluded. Then $\Sigma$ is denumerable, and we shall prove that this sub-set of T
satisfies the conditions of axiom T.X, i.e. that between any two instants of T is a member of $\Sigma$.

† The existence of such a progression follows from axiom F.2.

Let $T_x$, $T_y$ be any two instants ($T_x < T_y$). Then from axiom F.2 there is a particle P of
$\mathcal{L}$ such that

$$T_x < P(T_x) < T_y. \tag{4}$$

Since P does not coincide with T at $T_\omega$ and since $T_\omega$ is the limit of the descending progression
$T_n$, it follows that there is an n such that $T_n < P(T_\omega)$, whence from (3), $\underset{n}{P}(T_\omega) < P(T_\omega)$. We
deduce that $\underset{n}{P}(T_t) < P(T_t)$ for all instants of T because by theorem 12.2, P, $\underset{n}{P}$ do not coincide.
In particular, therefore, and from (4),

$$\underset{n}{P}(T_x) < P(T_x) < T_y. \tag{5}$$

Since $\underset{n}{P}$ does not coincide with T, the ascending and descending progressions $\underset{n}{P}^{\,r}(T_\omega)$,
$\underset{n}{P}^{\,-r}(T_\omega)$ (r = 1, 2, . . .) have no limits in T, so that T is covered by the intervals

$$\bigl(\underset{n}{P}^{\,p}(T_\omega),\ \underset{n}{P}^{\,p+1}(T_\omega)\bigr),$$

where n is fixed and p = 0, ±1, ±2, . . . . There is therefore a p such that

$$\underset{n}{P}^{\,p}(T_\omega) < T_x < \underset{n}{P}^{\,p+1}(T_\omega). \tag{6}$$

From this and (5), since $\underset{n}{P}$ is an increasing function, we have

$$\underset{n}{P}^{\,p+1}(T_\omega) = \underset{n}{P}\,\underset{n}{P}^{\,p}(T_\omega) < \underset{n}{P}(T_x) < T_y. \tag{7}$$

Finally, from (6) and (7),

$$T_x < \underset{n}{P}^{\,p+1}(T_\omega) < T_y,$$

which shows as required that a member of $\Sigma$ lies between $T_x$ and $T_y$.

From the above theorem and the second corollary to theorem 12.1 we have at once, 
remembering that the set T' is closed, 

Theorem 13.2. The set T is ordinally similar to an open interval of the arithmetic
continuum.

Corollary 1. The set T' is ordinally similar to a closed interval of the continuum.

Corollary 2. Every particle-set P' of instants is ordinally similar to a closed interval
of the continuum. This follows from the fact that a signal correspondence is one-one and
preserves order.

We now see from the foregoing theorems that there is a one-one correlation between the 
instants of a particle P and the numbers of an open interval of the arithmetic continuum such 
that order is preserved, the relations “before” and “after” corresponding to “less than” and
“greater than” respectively. Such a correlation we call a clock attached to P, and the word
“instant” can be used when referring to the corresponding number, or “clock reading”. From
the definition it follows that a clock reads continuous time. A clock attached to a particle is
not unique, i.e. can be regraduated. A regraduation corresponds to a transformation of the
form $t' = \psi(t)$, where t, t' are old and new clock readings and $\psi$ is continuous and monotonic
increasing.

14. τ-Clocks and Spatial Distance

For a given clock attached to T, each record and modified record function is equivalent
to a c.m.i. (continuous monotonic increasing) function of a numerical variable t in some
interval <a, b>. From the fundamental axioms and theorems it follows that the functions
$P^*$, $Q^*$, . . . given by particles of $\mathcal{L}$ can now be regarded as functions of t and have the
following properties:

(i) Each function is complete and node-free† in <a, b>.

(ii) Each function commutes with each other function.

(iii) The set of functions contains a sequence which converges uniformly to the identity
function.

† The present terminology, and the definition and study of related sets of functions referred to later, are
given in Walker, 1946. A node of f(x) is a root of f(x) - x, and a function f is complete and node-free in an
interval <a, b> if it is c.m.i. in this interval and has nodes at a and b but at no point between.

These are exactly the properties which functions must possess to ensure that they form a
related set, as was shown in a recent study of commutative functions (Walker, 1946, § 13).

A definitive property of a related set is that there is a function $\psi$, c.m.i. and with extreme
values 0 and ∞ in the interval <a, b>, such that each function of the set can be expressed
in canonical form $\psi^{-1}\{\alpha\psi(t)\}$ for some constant $\alpha > 0$. Applying this to the set of modified
record functions and then regraduating T’s clock from t to τ by means of the transformation
$\tau = \log \psi(t)$, we finally get the following theorem:

Theorem 14.1. There is a clock attached to T and reading time τ such that the modified
record function for every particle P of $\mathcal{L}$ is of the form $P^*(\tau) \equiv \tau + 2x$, where x is a constant
which varies for different particles.
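
The passage from the canonical form to the theorem is a one-line change of variable. The sketch below assumes that the canonical form of a related set is $\psi^{-1}\{\alpha\psi(t)\}$, which is the form consistent with the regraduation $\tau = \log\psi(t)$; the precise statement should be taken from Walker (1946). The symbol $\tau_{\text{new}}$ is introduced here only to name the transformed reading.

```latex
% Sketch, assuming the canonical form f(t) = \psi^{-1}\{\alpha\,\psi(t)\}, \alpha > 0.
\begin{align*}
\tau &= \log \psi(t)
  && \text{(regraduation of T's clock)} \\
\tau_{\text{new}} &= \log \psi\bigl(f(t)\bigr)
  = \log\bigl(\alpha\,\psi(t)\bigr)
  = \tau + \log \alpha \\
P^{*}(\tau) &= \tau + 2x, \qquad 2x = \log \alpha .
\end{align*}
% \alpha > 1, \alpha = 1, \alpha < 1 give x > 0, x = 0, x < 0 respectively.
```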

Definitions. A clock satisfying the conditions of the above theorem will be called a
τ-clock, and for a given τ-clock, the constant x in the theorem will be called the distance TP.

We observe that the distance TP is positive, zero or negative according as P is to the right 
of, coincident with, or to the left of T. Also, from theorem 10.1, we see that P, Q are symmetric 
about R if the distances from T satisfy 

TP+TQ = 2 TR. (8) 
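
Equation (8) follows from theorem 10.1 by direct substitution. In the sketch below x, y, r denote the distances TP, TQ, TR for the given τ-clock; the letters y and r are chosen here for convenience.

```latex
% Sketch: theorem 10.1 in terms of a \tau-clock.
% Assumption: P^*(\tau) = \tau + 2x, Q^*(\tau) = \tau + 2y, R^*(\tau) = \tau + 2r.
\begin{align*}
R^{*} P^{*-1} R^{*}(\tau) &= (\tau + 2r) - 2x + 2r = \tau + 2(2r - x), \\
Q^{*}(\tau) = R^{*} P^{*-1} R^{*}(\tau)
  &\iff 2y = 2(2r - x) \iff x + y = 2r,
\end{align*}
% i.e. TP + TQ = 2\,TR.
```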

Theorem 14.2. For a given τ-clock attached to T, the set X of distances from T to the
particles of $\mathcal{L}$ is everywhere-dense in the continuum.

To prove this we first see that axiom F.2 implies that the set X contains a series $x_n$ ($x_n \neq 0$)
such that $x_n \to 0$. Now from (8) and the complete symmetry of $\mathcal{L}$ we see that if x belongs
to X, then px belongs to X for all integers p. The theorem now follows from the fact that
the set $px_n$ is contained in X and is everywhere-dense in the continuum.
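
The density of the multiples $px_n$ can be seen from a standard estimate, sketched here for an arbitrary real number c and tolerance ε, both of which are introduced for the illustration.

```latex
% Sketch: the set \{ p\,x_n \} is everywhere-dense.
% Assumption: x_n \neq 0 and x_n \to 0; c real and \varepsilon > 0 are arbitrary.
\begin{align*}
&\text{choose } n \text{ with } 0 < |x_n| < \varepsilon, \qquad
  p = \left\lfloor c / x_n \right\rfloor ; \\
&0 \le \frac{c}{x_n} - p < 1
  \;\Longrightarrow\;
  |c - p\,x_n| < |x_n| < \varepsilon .
\end{align*}
```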

Since each member of X is a limit point, we have:

Corollary. The set $\mathcal{L}$ is dense at every instant of every particle of the set.

We now have a clear picture of $\mathcal{L}$ as a one-dimensional space in which each point corresponds
to a particle and is determined by its distance from a fixed point. Although the set X
of distances is everywhere-dense, it is not necessarily closed; it could, for example, be denumerable.
If, however, X is not closed, then it is possible to define ideal particles, just as we defined
ideal instants in § 3, so that they have the properties possessed by particles of $\mathcal{L}$ and so that
when $\mathcal{L}$ is augmented by their addition, then the augmented set X is closed.

Without assuming that X has been augmented in this way, we shall prove the following 
theorem concerning the uniqueness of τ-clocks:

Theorem 14.3. A τ-clock is unique except for regraduations of the form $\tau' = A\tau + B$
($A > 0$), i.e. arbitrary changes of unit and zero. All distances from T are affected proportionately
when the unit is changed.

Consider a regraduation $\tau' = \phi(\tau)$ where $\phi$ is c.m.i. Then a record function $\tau + 2x$ transforms
into the new record function $\phi\{2x + \phi^{-1}(\tau')\}$, and this has the required form $\tau' + 2x'$ if

$$\phi(2x + y) = \phi(y) + 2x', \tag{9}$$

where x' depends upon x but not upon y. The regraduation therefore gives a new τ-clock if
this equation is satisfied for all y and for all x of X. Writing $\theta(\tau) = \phi(\tau) - \phi(0)$, then we find
$2x' = \theta(2x)$, and (9) becomes

$$\theta(2x + y) = \theta(2x) + \theta(y).$$

This must be satisfied for all y and all x of X, and therefore all x since X is everywhere-dense
and $\theta$ must be continuous. Hence $\theta(k) = k\theta(1)$ for every rational number k, and hence for all
k by continuity, so that $\phi(\tau)$ is of the form $A\tau + B$ where A, B are constants.
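
The solution of the functional equation is the familiar argument for Cauchy's equation. A sketch follows, with u standing for any real number and n, m for integers, all introduced for the purpose.

```latex
% Sketch: continuous solutions of \theta(u + y) = \theta(u) + \theta(y).
\begin{align*}
\theta(0) &= 0, \qquad \theta(-u) = -\theta(u), \\
\theta(n u) &= n\,\theta(u) && (n \text{ an integer, by induction}), \\
m\,\theta\!\left(\tfrac{n}{m}\right) &= \theta(n) = n\,\theta(1)
  \;\Longrightarrow\; \theta(k) = k\,\theta(1) && (k = n/m \text{ rational}), \\
\theta(\tau) &= A\tau, \quad A = \theta(1) && (\text{all real } \tau, \text{ by continuity}), \\
\phi(\tau) &= \theta(\tau) + \phi(0) = A\tau + B, \quad B = \phi(0).
\end{align*}
% A > 0 because \phi is monotonic increasing.
```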

Conversely, it is at once seen that every regraduation of this kind gives a τ-clock, a record
function $\tau + 2x$ transforming into the function $\tau' + 2Ax$. The original distances are thus
multiplied by A to give the new distances, this then being the effect of a change of unit.





15. Construction of a τ-Clock

Having proved a number of existence theorems concerning τ-clocks, we now come to the
problem of constructing a τ-clock directly from primitive observables, without first postulating
an arbitrary clock and without having to solve complicated functional equations. There are
two arbitrary elements in the construction of a τ-clock, corresponding to the arbitrary constants
in the affine transformation admitted by such a clock. We choose therefore any instant $T_0$ of T
and any other particle P of $\mathcal{L}$ and define a clock in terms of these; if $T_0$ or P are changed,
then the clock will undergo an affine regraduation.

Let Q be any particle of $\mathcal{L}$. Then from the properties of $\mathcal{L}$ it follows that if p, q are
integers, $Q^q(T_x)$ >, = or < $P^p(T_x)$, and the relation is the same for all $T_x$. Also, both
inequalities occur for different choices of p, q. Hence a unique number $\lambda \ge 0$ is defined as a
section of the rationals by

$$q\lambda \lesseqgtr p \quad \text{according as} \quad Q^q(T_x) \lesseqgtr P^p(T_x).$$

This number $\lambda$ is an observable associated with T, P, and Q, and in particular, $\lambda = 0$ when
Q = T, $\lambda = 1$ when Q = P, and $\lambda = p$ when $Q = P^p$.

A clock is now attached to T which is defined to read $\lambda q$ at the instant $Q^q(T_0)$, and so for all
integers q and all particles Q of $\mathcal{L}$. It can be verified that these clock-readings have the
required numerical order in correspondence with temporal order, that they form an everywhere-dense
set in the continuum, and that the instants they record form an everywhere-dense set in T.
The extension to all other instants of T is therefore determined by continuity, and a complete
clock has been constructed for T.

That this is a τ-clock can be verified directly or by means of the existence theorems of § 14.
For if a τ-clock is chosen to read 0, 1 at instants $T_0$, $P(T_0)$ respectively, then P's record function
is $P(\tau) = \tau + 1$ and from the above definition of $\lambda$, Q’s record function is $Q(\tau) = \tau + \lambda$. The
reading of $Q^q(T_0)$ is therefore $\lambda q$, and the τ-clock agrees with our constructed clock at all these
instants and therefore at all instants of T by continuity. The clock we constructed is therefore
this τ-clock.
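
A numerical illustration may help. Suppose, purely for the example, that in some τ-clock the record functions are $P(\tau) = \tau + 1$ and $Q(\tau) = \tau + \sqrt{2}$; the value $\sqrt{2}$ is an arbitrary choice and is not taken from the text.

```latex
% Illustration with assumed record functions P(\tau) = \tau + 1, Q(\tau) = \tau + \sqrt{2}.
\begin{align*}
Q^{q}(\tau) = \tau + q\sqrt{2}, &\qquad P^{p}(\tau) = \tau + p, \\
Q^{q}(T_x) < P^{p}(T_x) &\iff q\sqrt{2} < p
  && (\text{for every } T_x), \\
q = 5:\quad Q^{5}(T_x) &< P^{8}(T_x), \qquad Q^{5}(T_x) > P^{7}(T_x)
  && (7 < 5\sqrt{2} < 8).
\end{align*}
% For q = 5 the section gives 7/5 < \lambda < 8/5; larger q narrow the bounds,
% and the section of the rationals so defined is \lambda = \sqrt{2}.
```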


16. Similar τ-Clocks and Simultaneity

Having discussed clocks attached to one particle, we now come to the problem of attaching
clocks to all particles of $\mathcal{L}$ so that they are in some sense “similar” and “synchronised” and
give rise to a unique definition of simultaneity. We shall first define a clock for each particle
in terms of a given τ-clock attached to T and show that all these clocks are τ-clocks with the
desired properties. Later, in § 17, we shall examine the problem from a different point of view
and deduce that the clocks previously defined are the only clocks having certain general
properties.

Suppose that a τ-clock has been attached to T, and let P be a particle of $\mathcal{L}$ whose distance
from T is x. For any τ, let $T_\tau$ be the instant whose clock reading is τ, and let $P_{\tau+x}$ be the
instant of P on the right optical line through $T_\tau$ (i.e. $T_\tau \wedge P_{\tau+x}$ or $T_\tau \vee P_{\tau+x}$ according as P
is to the right or left of T). Then we attach a clock to P which is defined to read $\tau + x$ at the
instant $P_{\tau+x}$, and so for every τ. A clock is similarly attached to every other particle of $\mathcal{L}$.

To prove that P's clock is a τ-clock, let Q be any other particle of $\mathcal{L}$ and let the distance TQ
be y. Then the modified record functions of P and Q with respect to T are $\tau + 2x$, $\tau + 2y$
respectively, and from the lemma of theorem 10.1 we deduce at once that the modified record
function of Q with respect to P is $\tau + 2(y - x)$ in terms of P's time τ. This is of the required
form, so that P's clock is a τ-clock.

From the function $\tau + 2(y - x)$ derived above we deduce that if PQ is the distance of Q
from P defined in relation to P's clock, then PQ = TQ - TP. In particular, PT = -TP,
i.e. T is the same numerical distance from P as P is from T. This fact leads us to describe
the clocks attached to T and P as similar.

If now we regraduate T's clock by means of $\tau' = A\tau + B$, then the new distance TP is
$x' = Ax$, and we have

$$\tau' + x' = A\tau + B + Ax = A(\tau + x) + B.$$

This shows that when T’s clock is regraduated affinely, then P’s clock undergoes exactly the
same regraduation, and all distances are multiplied by the same constant.

Simultaneity between T and P is defined as the one-one correspondence in which instants
with the same clock-reading correspond, P’s clock being defined in terms of T’s as described
above. It is easily deduced from our definitions that these correspondences are transitive,
i.e. that instants of P and Q are simultaneous if they are both simultaneous with the same
instant of T. We also see that the simultaneity correspondence between any two particles of
$\mathcal{L}$ is independent of the particular τ-clock attached to T, since all clocks undergo the same
affine transformation when T’s clock is regraduated.


17. General Simultaneity 

A general definition of simultaneity, which closely resembles Milne’s definition of 
equivalence, is as follows:— 

Definition. A simultaneity between particles P, Q (not necessarily fundamental) is a
one-one correspondence between their instants, which we write as $P_x \sim Q_x$, such that

(i) $P_x < P_y$, $P_x \sim Q_x$ and $P_y \sim Q_y \Rightarrow Q_x < Q_y$;

(ii) $P_x \sim Q_x$, $P_x \wedge Q_y$ and $Q_x \wedge P_y \Rightarrow P_y \sim Q_y$.

We see from (i) that the correspondence is ordinal and continuous. Also, if $P_x \sim Q_x$, then
$Q_x \sim P_x$.

If, for a given simultaneity, $P_x \sim Q_x \wedge P_y$, then we write $P_y = \theta(P_x)$, and call the function $\theta$
defined in this way a signal function, a name which arises out of Whitrow’s study of equivalences.
We observe that the same function is given by $P_x \wedge Q_y \sim P_y$. A signal function is clearly
continuous and monotonic increasing.

Writing $P_y = \theta(P_x)$ and $P_z = \theta(P_y) = \theta^2(P_x)$, then if $P_y \sim Q_y$, we have $P_x \wedge Q_y \wedge P_z$,
i.e. $P_z = Q(P_x)$ in the notation of record functions. Hence, for all $P_x$,

$$\theta^2(P_x) = Q(P_x). \tag{10}$$

Conversely every c.m.i. function $\theta$ which satisfies (10) gives rise to a simultaneity between
P and Q, the correspondence being $P_x \sim Q_x$ where $Q_x$ is given by $Q_x \wedge \theta(P_x)$.

This last result shows how a simultaneity can be constructed and how arbitrary it is. A
solution of the functional equation (10) for $\theta$ may not exist when the ordered set P is non-linear,
but always exists when P is linear (Milne and Whitrow, 1938), as is now the case from the
second corollary of theorem 13.2.

We shall say that a set of particles admits simultaneity if simultaneities can be found for
all pairs such that they are transitive, i.e. $P_x \sim Q_x$ and $P_x \sim R_x$ imply $Q_x \sim R_x$. We shall not
discuss such sets at length, but it can be verified with the aid of diagrams that if T, P, Q
are collinear and admit simultaneity, then

(i) if $T_x \sim P_x \sim Q_x$, the three particles have the same order on every optical line
through $T_x$, $P_x$, or $Q_x$;

(ii) $\theta\phi = \phi\theta$, where $\theta$, $\phi$ are the signal functions of P, Q respectively with respect to T.

Since $\theta^2 = P$ and $\phi^2 = Q$, it follows from (ii) that PQ = QP and then from (i) that
$P^*Q^* = Q^*P^*$. Hence:

Theorem 17.1. A collinear set of particles which admits simultaneity is commutative.

It is easily seen that commutation is less restrictive than simultaneity, for $P^*Q^* = Q^*P^*$
does not imply PQ = QP. Nor even does PQ = QP imply the existence of $\theta$, $\phi$ such that

$$\theta^2 = P, \qquad \phi^2 = Q, \qquad \theta\phi = \phi\theta. \tag{11}$$

We also notice that for a set of particles, simultaneity is not a primitive observable. The
primitive (“observable”) functions are the record functions, and the question of simultaneity
depends upon whether solutions of equations similar to (11) for $\theta$ and $\phi$ do or do not exist. This
cannot be tested by a sequence of primitive experiments.




Turning now to the set $\mathcal{L}$, suppose that a τ-clock has been attached to T. Then for a
particle P of $\mathcal{L}$ we have $P^*(\tau) = \tau + 2x$ whence $P(\tau) = \tau + 2|x|$, so that PQ = QP for all
pairs.

For simultaneity, we want to find if possible a $\theta$ for each P such that $\theta^2 = P$ and all the $\theta$’s
commute in pairs, each $\theta$ being c.m.i. Since $\theta\phi = \phi\theta$ implies $\theta\phi^2 = \phi^2\theta$, it follows that each $\theta$
must commute with all the P’s, i.e. with all functions of the form $\tau + k$ ($k > 0$). Hence each
$\theta$ must satisfy

$$\theta(\tau + k) = \theta(\tau) + k$$

for all τ and all $k > 0$, and this leads at once to the form $\theta(\tau) = \tau + A$, where A is any constant.
Substituting now in $\theta^2 = P = \tau + 2|x|$, we find that the corresponding $\theta(\tau)$ must be $\tau + |x|$.
Thus the signal function for each particle is unique and defines exactly the simultaneity given
in § 16. These simultaneities are clearly transitive, and we have:
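
The form of $\theta$ can be obtained as follows. The sketch assumes that the readings of the τ-clock range over all real numbers, as they do when the clock arises from a regraduation $\tau = \log\psi(t)$ with $\psi$ running from 0 to ∞.

```latex
% Sketch: solving \theta(\tau + k) = \theta(\tau) + k for all \tau and all k > 0.
\begin{align*}
\tau > 0:\quad \theta(\tau) &= \theta(0 + \tau) = \theta(0) + \tau
  && (k = \tau), \\
\tau < 0:\quad \theta(0) &= \theta(\tau + |\tau|) = \theta(\tau) + |\tau|
  \;\Longrightarrow\; \theta(\tau) = \theta(0) + \tau
  && (k = |\tau|), \\
\theta(\tau) &= \tau + A, \qquad A = \theta(0); \\
\theta^{2}(\tau) &= \tau + 2A = \tau + 2|x|
  \;\Longrightarrow\; A = |x| .
\end{align*}
```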

Theorem 17.2. The set $\mathcal{L}$ admits simultaneity in a unique way, this being equivalent
to the definition given in § 16.

18. Particles Collinear with Fundamental Particles 

We have not so far considered, in relation to time keeping, particles which do not belong
to an $\mathcal{L}$. The general case will be discussed in a later paper, but we can now prove the
following:

Theorem 18.1. If a particle A is collinear with those of $\mathcal{L}$, then A coincides with a
particle of $\mathcal{L}$ at every instant of A. (It is assumed here that the set $\mathcal{L}$ has been augmented
as described in § 14.)

Let T be a particle of $\mathcal{L}$. Then for any instant $A_x$ of A there is an instant $T_x$ of T which
lies on the right optical line through $A_x$. From the spatial properties of $\mathcal{L}$ described in § 14
there is for any instant $T_y$ a particle P of $\mathcal{L}$ such that $P^*(T_x) = T_y$, and taking $T_y$ to be $A^*(T_x)$,
it follows that there is a P such that $P^*(T_x) = A^*(T_x)$. Hence from theorem 8.3, A coincides
with P at $A_x$.

When similar τ-clocks have been attached to the particles of $\mathcal{L}$, then a unique clock can be
attached to any particle A which is collinear with those of $\mathcal{L}$. For any instant $A_x$ of A, let P
be the member of $\mathcal{L}$ which coincides with A at $A_x$, and let $P_x$ be the instant of P coincident
with $A_x$. Then A’s clock is defined so that the reading at $A_x$ is the clock-reading of P at $P_x$.
It is easily verified that the clock so defined has the desired order and continuity properties.

The position of a particle A such as the above can be specified at any instant $A_x$ by defining
its distance at this instant from a fixed particle T of $\mathcal{L}$. This is simply the distance from T to
the particle of $\mathcal{L}$ which coincides with A at $A_x$. Thus finally there is associated with each
instant of A a clock-reading, τ, and a spatial co-ordinate, x. These numbers are also given
directly in terms of T’s clock-readings by $\tau = \tfrac12(\tau_1 + \tau_0)$, $|x| = \tfrac12(\tau_1 - \tau_0)$, where $\tau_0$, $\tau_1$ are the
readings of instants $T_0$, $T_1$ such that $T_0 \wedge A_x \wedge T_1$. This agrees with Milne’s conventions
for measuring epoch and distance (with c = 1).
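
These formulae can be checked directly from the similar τ-clocks of § 16. In the sketch below a signal leaves T at $T_0$, reaches A at $A_x$ and returns to T at $T_1$, and A is taken to coincide at $A_x$ with the particle P at distance x > 0 from T; the restriction to x > 0 is made only to fix the signs.

```latex
% Sketch for x > 0 (A coincides at A_x with the particle P at distance x from T).
% \tau_0, \tau_1: readings of T_0, T_1; \tau: reading assigned to A_x.
\begin{align*}
\tau &= \tau_0 + x
  && (\text{P's clock reads } \tau_0 + x \text{ on the right optical line through } T_0), \\
\tau_1 &= P(\tau_0) = \tau_0 + 2x
  && (\text{record function of P}), \\
\tfrac12(\tau_1 + \tau_0) &= \tau_0 + x = \tau, \qquad
\tfrac12(\tau_1 - \tau_0) = x .
\end{align*}
% For x < 0 the same computation with |x| in place of x applies,
% since P(\tau) = \tau + 2|x| and the clocks of T and P are similar.
```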

The “motion” of A (relative to $\mathcal{L}$) is described by the functional relation of x to τ, and it
follows from our axioms that such a function is single-valued and continuous.

This completes our discussion of linear sub-sets of fundamental particles. 


REFERENCES TO LITERATURE 

Hobson, E. W., 1921. The Theory of Functions of a Real Variable, 1, Cambridge.
McVittie, G. C., 1942. Proc. Roy. Soc. Edin., A, LXI, 210.
Milne, E. A., and Whitrow, G. J., 1938. Zeits. für Astrophys., XV, 270.
Sierpinski, W., 1928. Nombres Transfinis, Paris.
Walker, A. G., 1946. Quart. Journ. Math., xvii, 65.
Walker, A. G., 1947. Rev. Scientifique, LXXXV, 131.

(Issued separately May 12, 1948) 





XXXV.— Graphite Crystals and Crystallites. I. Binding Energies in Small 
Crystal Layers. By Mary Bradburn (Royal Holloway College, London 
University), C. A. Coulson (Wheatstone Physics Laboratory, King’s College, 
London), and G. S. Rushbrooke (Chemistry Department, Leeds University). 
(With Two Text-figures.) 

(MS. received September 18, 1946. Read February 3, 1947) 
1. Introduction*

The process of building up a crystal or metallic lattice from the individual atoms is a very 
complex one: at present it is not properly understood theoretically. In part this is due to 
difficulties in dealing with large, but not infinitely large, systems. Thus, on the one hand the 
infinite lattice has been fully discussed, and on the other hand so also has the small aggregate 
such as the diatomic molecule. But between these two extremes there lies an important field 
for which practically no theoretical results are known. Indeed the only work of this kind 
seems to be that of Taylor, Eyring and Sherman (1933), who considered systems containing 
up to 8 atoms in s-states; although this work is interesting it is of little value for discussing
the properties of the finite crystallites which exist with many more atoms than 8. In this 
connection it seems generally agreed now that in the process of crystallisation the nuclei around 
which the larger crystals grow never contain less than 25 atoms. And, moreover, the Heitler- 
London method used by these writers is not readily extended to larger aggregates. 

These larger aggregates, which we shall refer to as crystallites, occur both in metallic and 
non-metallic systems. They have a particular importance in the case of carbon, where, in the 
pyrolysis of natural coals, cellulose, lignin and certain large aromatic molecules (Blayden, 
Gibson and Riley, 1943), chars are obtained containing layers of carbon atoms with between 
38 and 150 atoms in a plane. In view of the importance of these crystallites in metallurgical 
and other practice, we have studied their properties and we present in this paper an account of 
our work. The methods may be applied, with a little more difficulty, to other systems. But 
there is an advantage in dealing first with the graphite crystallites: for X-ray evidence makes 
it very clear that these small aggregates resemble the large graphite crystal in having a planar 
structure, so that in each layer plane there is the familiar hexagonal pattern of carbon atoms, 
with a C—C bond length about 1.42 A., whereas the individual planes are separated by a much
greater distance of some 3.4 A. This large distance shows that the forces between the planes
are largely non-specific Van der Waals forces. Along any one plane there are stronger forces 
of metallic or molecular valence type. This conclusion is reinforced by the fact that in many 
cases there is no correlation between the individual planes, which occupy random relative 
positions so far as translations and rotations are concerned (Warren’s turbostratic system, 
1941). For this reason we can, provisionally, neglect all inter-layer effects and consider only 
the single layers separately. This does not mean that these other effects are really non-existent,
but rather that we must proceed stage by stage, so that a proper understanding of the
forces between the atoms in one plane must precede any discussion of the forces between 
neighbouring planes. 

This decision greatly simplifies our problem. For a crystallite layer with 50 to 100 carbon 
atoms arranged hexagonally is like a huge condensed aromatic molecule. In fact it lies 
between a molecule like coronene (24 carbon atoms) and the infinite graphite layer plane. 
Now it has been shown that the methods of wave mechanics may be applied successfully to a 
study of coronene (Coulson, 1944) and to the infinite graphite layer (to be published). We may 

* The work contained in this paper was largely done when the authors were members of the staff of the 
Mathematics Department of University College, Dundee. 




therefore confidently expect the same methods to work satisfactorily with these crystallites of 
intermediate size. 

There are many problems associated with these layer systems. In this paper we shall 
discuss the binding energy in terms of the size and shape of the crystallite; this will lead us to 
an estimate of the C—C bond length and its dependence upon shape. In this way we shall get 
some indication of the manner in which the crystallites build up to form a larger layer in the 
process of graphitisation. Finally, we shall consider the question of chemical stability of these 
crystallites with regard to hydrogen addition. The conclusions of this latter discussion, 
however, are less reliable than the earlier ones, since they involve the absolute, instead of the 
relative, magnitudes of certain constants whose values are not known with certainty. 


2. Details of Model Assumed 

We must now describe more fully the model that we use to represent the crystallite layer. 
We shall suppose that there are N carbon atoms arranged hexagonally in a plane according 
to the scheme shown in fig. 1. We suppose also (i) that all the C—C distances are equal, and
(ii) that all the carbon atoms are in the trigonal state first introduced by Pauling. (For a
description of this state, see Coulson, 1941.)

[Fig. 1. Arrangement and numbering (r, s) of the carbon atoms in the crystallite layer.]

Now in the first place it is almost certain that the
C—C links are not all equal in length throughout any one layer, and in a later paper one of us 
will discuss this effect with particular reference to approximately circular molecules of the form 
C_xH_y, where x = 6n², y = 6n, and n = 1, 2, ... It is sufficient for the present to state that
although there is greater disparity between the links near the edge of the crystallite than in 
the centre, the effect is too small to make any serious difference to our conclusions in the present 
paper; a detailed allowance for this variation would make our calculations quite impracticable. 
In the second place, assumption (ii) requires us to suppose that the boundary, or surface, 
carbon atoms have their third trigonal valence satisfied. For that reason we shall imagine 
that these edge atoms have each one attached hydrogen atom. This makes them resemble 
the condensed hydrocarbon molecules which have often been discussed before with considerable
success (Lennard-Jones and Coulson, 1939). This assumption, without which it would
at present be impossible to make our calculations, is quite reasonable, though it is certainly 
not completely true to the actual situation. For, although Riley (1939) has shown that at 
fairly low temperatures (e.g. < 600° C.) there is enough hydrogen present to saturate all the
edge valencies, we have nevertheless neglected the possibility, which certainly occurs to some
extent, of acetylenic “tails” such as —C≡CH, and we have also disregarded completely the
effect of the remaining inorganic matter, chiefly oxygen, which is known to play an important 







part in the chemistry of coal. However, it is very probable that this latter effect is produced 
largely on the matrix within which the layer crystallites are embedded; in any case, until more 
experimental evidence is available with regard to the location and function of these inorganic 
substances, it is not possible to include them in our discussion. 

To summarise—we discuss the energies of the electrons in large condensed systems such 
as those of fig. 1, supposing that all the C—C bonds are equal in length and all the carbon 
atoms are in the trigonal, or aromatic, state with three localised single bonds and one mobile,
or π-electron, free to move over the carbon framework. The non-mobile electrons, responsible
for the basic single bonds, are sufficiently localised for us to treat them as ordinary molecular
valence bond electrons, making due allowance, when necessary, for their compression beyond
the normal C—C bond length. To the energy of these basic bonds we have to add that of the
mobile π-electrons whose wave functions have a nodal surface coincident with the layer plane.
Our main problem is to calculate this latter energy. 

A word is necessary concerning the shapes of these pseudo-molecules. These have been 
chosen so that the secular equation is soluble; and this condition severely restricts the possible 
arrangements of the N carbon nuclei. We have chosen the rectangular shape shown in fig. 1 
because it is then possible to use some results of D. E. Rutherford (1947) without further 
analysis. By varying m and n we are able to give a wide variety to the crystallite shape, and 
in particular we are able to answer the important question: does a layer of this type build up 
more easily in roughly square or filamentous shape? It will be recognised that m and n 
denote the numbers of hexagons along the two sides of the molecule. Thus m = 1 denotes
the polyacene series with particular cases naphthalene (m = 1, n = 2), anthracene (m = 1, n = 3);
and n = 1 denotes the polyphenyl series with particular cases diphenyl (2, 1) and p-terphenyl
(3, 1). We may also note that perylene is (2, 2) and chalkacene is (3, 2).

If N, B, H denote the total number of carbon atoms, C—C bonds and complete hexagons 
respectively, it is not difficult to show that 

$$N = 2m(2n + 1), \qquad B = 6mn + m - n, \qquad H = 2mn - m - n + 1. \tag{1}$$
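
The counting formulae (1) are easily checked against the familiar small molecules named above. The following is a minimal sketch; the function name is hypothetical. It also uses the fact that, for a connected planar network of this kind, the number of rings is B - N + 1, which agrees identically with the expression for H.

```python
def layer_counts(m, n):
    """Return (N, B, H) for the (m, n) layer according to equation (1)."""
    if m < 1 or n < 1:
        raise ValueError("m and n must be positive integers")
    atoms = 2 * m * (2 * n + 1)          # N, carbon atoms
    bonds = 6 * m * n + m - n            # B, C-C bonds
    hexagons = 2 * m * n - m - n + 1     # H, complete hexagons
    return atoms, bonds, hexagons


for name, (m, n) in {
    "benzene": (1, 1),
    "naphthalene": (1, 2),
    "anthracene": (1, 3),
    "diphenyl": (2, 1),
    "perylene": (2, 2),
}.items():
    N, B, H = layer_counts(m, n)
    # Ring count of a connected planar network: B - N + 1.
    print(name, N, B, H, B - N + 1 == H)
```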


3. Calculations 

Our method of calculation follows closely the method of molecular orbitals (Lennard-Jones 
and Coulson, 1939), in which we suppose each of the N mobile electrons to have a molecular 
orbit embracing all the nuclei, and represented to our degree of approximation, by a linear 
sum of separate atomic orbits. Thus if $\psi_{rs}$ denotes the atomic $2p_z$ orbit of the carbon atom
labelled (r, s) in fig. 1, the z-axis being directed normal to the layer plane, we write for the
molecular orbital,

$$\Psi = \sum_{r=1}^{2m} \sum_{s=1}^{2n+1} c_{rs}\,\psi_{rs}, \tag{2}$$

where the coefficients $c_{rs}$, which have a definite set of values for each allowed energy level, are
given by solution of the secular equations. There is one secular equation for each of the N 
carbon atoms. As usual, electrons are allotted to these orbits, two at a time, with opposed 
spins, so that the Pauli Exclusion Principle is satisfied, in order of increasing energy. To 
this degree of approximation the energies are additive, so that the total energy is just the 
sum of the component energies. There are, of course, N allowed energies in this “mobile 
shell”; but, by the “starring process” theorem of Coulson and Rushbrooke (1940) they occur 
symmetrically about the middle. And thus the lower half (bonding) of the mobile shell is, 
in each case, completely filled, the upper half (anti-bonding) being completely empty except 
in excited states with which we shall not be concerned. 

The secular equations which give the energy E and the coefficients $c_{rs}$ in (2) are

$$c_{rs}(E_0 - E) + {\sum_{i,j}}' c_{ij}\,\beta_{ij,rs} = 0, \qquad 1 \le i \le 2m, \quad 1 \le j \le 2n + 1, \tag{3}$$

where, if H is the effective Hamiltonian for each separate electron,

$$\int \psi_{rs} H \psi_{rs}\, d\tau = E_0 \quad \text{for all } (r, s),$$

$$\int \psi_{rs} H \psi_{ij}\, d\tau = \beta_{ij,rs} = \beta \quad \text{if } (r, s) \text{ and } (i, j) \text{ are neighbours}, \qquad = 0 \quad \text{otherwise.} \tag{4}$$

There is one equation (3) for each allowed value of r and s. The dash ′ in the summation
denotes that we omit the terms i = r, j = s. $E_0$ is the energy of one of these π-electrons when
confined to one nucleus, and β is the so-called resonance energy for a single electron jump to a
neighbouring nucleus, assumed to have the same value for every pair of neighbours. In this
approximation we neglect the overlap integral, putting

$$\int \psi_{rs} \psi_{ij}\, d\tau = 1 \quad \text{if } (r, s) \text{ is the same as } (i, j), \qquad = 0 \quad \text{otherwise.} \tag{5}$$

In a later paragraph we shall re-examine the legitimacy of this neglect of the overlap between 
nearest neighbours. 

With these approximations the summation in (3) is merely over the neighbours of (r, s). If 
(r, s) is an internal atom there are three neighbours; if it is a boundary atom there are only two. 
These types are illustrated by the typical examples below, in which, again, the numbering of 
the atoms follows that of fig. 1. 


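Internal atom (2, 3):

$$c_{23}(E_0 - E) + c_{13}\beta + c_{22}\beta + c_{33}\beta = 0.$$

Boundary atom (1, 3):

$$c_{13}(E_0 - E) + c_{14}\beta + c_{23}\beta = 0. \tag{6}$$

If we put

$$\epsilon = \text{binding energy} = E_0 - E, \qquad x = \epsilon/\beta, \tag{7}$$

these two equations may be written

$$c_{23}x + c_{13} + c_{22} + c_{33} = 0,$$

$$c_{13}x + c_{14} + c_{23} = 0. \tag{8}$$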
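
The neighbour lists behind these equations can be generated mechanically. The sketch below is illustrative only: since fig. 1 is not reproduced here, the labelling is an assumption, chosen so that r counts the 2n + 1 atoms along a zig-zag chain, s numbers the 2m chains, and the neighbours quoted for the atoms (2, 3) and (1, 3) are recovered. The function names are hypothetical.

```python
def layer_bonds(m, n):
    """Set of C-C bonds of the (m, n) layer, each bond a pair of (r, s) labels.

    Assumed labelling: r = 1..2n+1 along a zig-zag chain, s = 1..2m chain number.
    """
    bonds = set()
    for s in range(1, 2 * m + 1):
        for r in range(1, 2 * n + 1):
            bonds.add(((r, s), (r + 1, s)))      # bonds along chain s
    for s in range(1, 2 * m):
        # Chains (1,2), (3,4), ... are joined at odd r; chains (2,3), (4,5), ... at even r.
        start = 1 if s % 2 == 1 else 2
        for r in range(start, 2 * n + 2, 2):
            bonds.add(((r, s), (r, s + 1)))
    return bonds


def neighbours(atom, bonds):
    """Sorted list of the atoms bonded to `atom`."""
    found = [b for a, b in bonds if a == atom] + [a for a, b in bonds if b == atom]
    return sorted(found)


bonds = layer_bonds(2, 2)                 # the (2, 2) layer, perylene
print(len(bonds))                         # 24, in agreement with B = 6mn + m - n
print(neighbours((2, 3), bonds))          # [(1, 3), (2, 2), (3, 3)]
print(neighbours((1, 3), bonds))          # [(1, 4), (2, 3)]
```

Each atom then contributes one equation of the form (8), with one coefficient for each entry of its neighbour list.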

There are N such equations. If we eliminate the N coefficients $c_{rs}$ we obtain a determinantal
equation (secular equation) of order N. It is a rather formidable determinant, which we shall
not write down explicitly, but Rutherford (1947) has shown that it may be evaluated. He
shows that if we introduce quantities $z_k$ and $\phi_k$ defined by

$$z_k = 2\cos\frac{k\pi}{2(n + 1)}, \qquad x^2 = 1 + z_k^2 + 2 z_k \cos\phi_k, \tag{9}$$

then the value of the determinant is

$$\Delta = (x^2 - 1)^m \prod_{k=1}^{n} \frac{z_k^{2m+1} \sin (2m + 1)\phi_k + z_k^{2m} \sin 2m\phi_k}{z_k \sin\phi_k}. \tag{10}$$

The roots of the secular equation Δ = 0 are

$$x^2 = 1 \quad (m \text{ times}), \tag{11}$$

together with the roots (other than φ = 0 or π) of the n equations

$$z_k \sin (2m + 1)\phi_k + \sin 2m\phi_k = 0, \qquad k = 1, 2, \ldots, n. \tag{12}$$

The total number of roots of all these equations must be N. Now from (1) N = 4mn + 2m.
But the equation (11) contributes 2m roots, and each of the equations (12) contributes 2m roots
of $\phi_k$, that is, 4m roots of x. This latter fact is most easily seen by expanding

$$z \sin (2m + 1)\phi + \sin 2m\phi$$

as a power series in cos φ of degree 2m. By combining all the roots thus obtained we complete
the necessary total N. It is well known that all the roots of real Hermitean determinants of
this kind are real.
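
As a check on these formulae, the simplest case m = n = 1 (benzene) may be worked out by hand. The working below is an illustration added for the reader; it relies only on (9), (11) and (12) and on the elementary expansions of sin 3φ and sin 2φ.

```latex
\begin{align*}
z_1 &= 2\cos\frac{\pi}{4} = \sqrt{2}, \\
\sqrt{2}\,\sin 3\phi + \sin 2\phi
    &= \sin\phi\,\bigl[\sqrt{2}\,(4\cos^2\phi - 1) + 2\cos\phi\bigr] = 0, \\
\cos\phi &= \frac{1}{2\sqrt{2}} \quad\text{or}\quad \cos\phi = -\frac{1}{\sqrt{2}}, \\
x^2 &= 1 + z_1^2 + 2 z_1\cos\phi = 4 \quad\text{or}\quad 1 .
\end{align*}
```

Together with the root x² = 1 of (11), this gives x = ±2, ±1, ±1, that is N = 6 roots in all, which are the well-known levels of benzene.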

Our procedure has been to choose values of m and n (i.e. to choose a certain length and 
breadth for the molecule in fig. 1) and then to solve the equations (12) numerically. In this 
way we determine the individual energies in terms of the resonance integral β. By summing
over the occupied orbits we obtain the total mobile binding energy Σε. Let us denote this by
ℰ. This quantity is important for two reasons. First, it allows us to calculate ℰ/N, which is
the total mobile binding energy per carbon atom. Secondly, it allows us to calculate ℰ/B,
which is the total mobile binding energy per carbon-carbon bond.
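
This procedure can be sketched as follows. It is not a reconstruction of the authors' own numerical method: the sketch assumes that equation (12), divided by sin φ, is treated as a polynomial in c = cos φ by means of Chebyshev polynomials of the second kind, and that its real roots are located by a sign-change scan followed by bisection. All function names and the scan resolution are hypothetical choices.

```python
import math


def cheb_u(k, c):
    """Chebyshev polynomial of the second kind: U_k(cos phi) = sin((k+1) phi) / sin(phi)."""
    u_prev, u = 0.0, 1.0                       # U_{-1} = 0, U_0 = 1
    for _ in range(k):
        u_prev, u = u, 2.0 * c * u - u_prev
    return u


def eq12(c, z, m):
    """Left-hand side of (12) divided by sin(phi), as a polynomial in c = cos(phi)."""
    return z * cheb_u(2 * m, c) + cheb_u(2 * m - 1, c)


def roots_eq12(z, m, steps=20000):
    """Real roots c of eq12 lying between the values of c that give x^2 = 0 and x^2 = 9."""
    lo = -(1.0 + z * z) / (2.0 * z)
    hi = (8.0 - z * z) / (2.0 * z)
    roots = []
    a, fa = lo, eq12(lo, z, m)
    for i in range(1, steps + 1):
        b = lo + (hi - lo) * i / steps
        fb = eq12(b, z, m)
        if fa == 0.0:
            roots.append(a)
        elif fa * fb < 0.0:                    # a root lies between a and b: bisect
            left, right, fl = a, b, fa
            for _ in range(80):
                mid = 0.5 * (left + right)
                fm = eq12(mid, z, m)
                if fl * fm <= 0.0:
                    right = mid
                else:
                    left, fl = mid, fm
            roots.append(0.5 * (left + right))
        a, fa = b, fb
    if fa == 0.0:
        roots.append(a)
    return roots


def mobile_energy_per_atom(m, n):
    """Total mobile binding energy per carbon atom, in units of beta."""
    levels = [1.0] * m                         # x^2 = 1 occurs m times, equation (11)
    for k in range(1, n + 1):
        z = 2.0 * math.cos(k * math.pi / (2.0 * (n + 1)))
        cs = roots_eq12(z, m)
        if len(cs) != 2 * m:
            raise RuntimeError("scan too coarse: expected 2m roots of (12)")
        levels += [math.sqrt(max(0.0, 1.0 + z * z + 2.0 * z * c)) for c in cs]
    # Two electrons occupy each of the N/2 bonding levels (positive x).
    return 2.0 * sum(levels) / (2 * m * (2 * n + 1))


print(round(mobile_energy_per_atom(1, 1), 4))  # benzene: 1.3333
```

Values for other (m, n) obtained in the same way may be compared with the entries of Table I; the root count is checked against 2m so that a scan that is too coarse is reported rather than silently accepted.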

ℰ/N is not itself a measure of the resonance energy, but this quantity may easily be calculated
as follows. In any single Kekulé structure each carbon atom is attached to one end of a
double bond, so that the number of double bonds is ½N. If these double bonds were “fixed”,
the remaining bonds being pure single bonds, their binding energy would be ½N × 2β = Nβ.
The difference between this and the calculated binding energy, viz. ℰ − Nβ, is sometimes
called the resonance energy, but is more aptly called the delocalisation energy, since it measures,
at least approximately, the gain in energy due to the delocalising, or metallic character, of these
electrons. Now a table of ℰ/Nβ is given below (Table I) for various values of m and n. The
resonance energy per atom is found in terms of β by subtracting unity from each entry in the
table.


Table I. Values of ℰ/Nβ (Mean Mobile Binding Energy per C Atom)

  m \ n     1        2        3        4        5        6        7        8        15       19       ∞
    1     1.3333   1.3684   1.3796   1.3851   1.3886   1.3906   1.3921   1.3935   1.3976   1.3987   1.4028
    2     1.3653   1.4122   1.4314   1.4424     -      1.4547   1.4584     -        -        -      1.4826
    3     1.3762   1.4279   1.4507     -      1.4731   1.4789   1.4835     -        -        -      1.5125
    4     1.3817     -        -      1.4751     -      1.4917     -        -        -        -        -
    5     1.3851     -      1.4677     -        -        -        -        -        -        -        -
    6     1.3872   1.4442     -        -        -        -        -        -        -        -        -
    7     1.3889     -        -        -        -        -        -        -        -        -        -
    8     1.3899     -      1.4774     -        -        -      1.5167     -      1.5342     -        -
    ∞     1.3983   1.4612   1.4960   1.5136   1.5240     -        -        -        -        -      1.5761


This table also shows certain values for the cases when m or n is infinite. These are obtained 
somewhat differently, and are discussed in paper II. 

It may be objected that no allowance has been made for the compression energy of the 
basic single C—C bonds. As the bond lengths do not change by much from molecule to 
molecule, this compression energy will be about constant. Thus although Mulliken, Rieke 
and Brown (1941 and later papers) have shown that this energy is a significant fraction of the 
total binding energy, we may safely regard it as included in the empirically determined 
resonance energy β. Thus, if we may anticipate a little, we may use the bond lengths shortly
to be deduced, and collected in Table III, to estimate the variations in compressional energy
among the molecules of Table I. If we omit the smallest molecule of this system given by
m = 1, n = 1, the mean bond length varies over a total range of 0.02 A. If we consider only
those molecules for which m and n exceed 3, the range of bond length is 0.008 A. Now
Mulliken, Rieke and Brown (1941) have shown that in the benzene molecule this compressional
energy amounts to 35 KCals per mole for the six bonds. This suggests, on a proportional
basis, that if the mean bond length varies over a range of 0.02 A., variations of β do not exceed
1 KCal, and if the range is less than 0.008 A., the variations are less than 0.5 KCal. These
changes in β are smaller than is implied in the approximation of our model in section 2, and
justify us, in this present connection, in neglecting the variations of compressional energy from 
one molecule to another. 

Table II shows the corresponding values of ℰ/2Bβ. Now, according to the theory developed
by Coulson (1939), ℰ/B is immediately related to the average order of the C—C bonds. In
fact, the mean mobile bond order p is given by p = ℰ/2Bβ. This is the quantity shown in
Table II. If we know the bond order we can use the curve of order against length to deduce
the corresponding bond length. In Table III these mean lengths are shown measured in A.
units. There is still some uncertainty with regard to the lengths of the fundamental C—C,
C=C and C≡C bonds, particularly the C=C bond. But if we take these as 1.542, 1.331 and
1.202 A. respectively, we obtain the bond lengths given in Table III. In this table we give
the lengths to 0.001 A. The uncertainty in C=C means that no reliance whatever may be
placed upon the absolute values shown in the table. But there is every reason to believe that
the relative values are quite correctly spaced. It may indeed easily happen that the absolute
values are not far wrong, for the bond length in the full graphite layer (m = n = ∞, calculations
in a later paper) is predicted to be 1.416 A., and observed values are 1.415 A. (Taylor, 1941)
or 1.421 A. (Nelson and Riley, 1945). But even if this almost exact coincidence of values is
fortuitous, we may safely claim that the differences between the mean bond lengths for different
combinations of m and n, as given in Table III, are significant. It will be recognised that the
figures given in these tables are entirely independent of the precise numerical value of β.
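
The passage from the energy per atom to the mean mobile bond order uses nothing more than the counts N and B of equation (1), since p = ℰ/2Bβ = (ℰ/Nβ) × N/2B. A minimal sketch, with a hypothetical function name:

```python
def mean_bond_order(energy_per_atom, m, n):
    """Mean mobile bond order p = E/(2 B beta), from E/(N beta) and equation (1)."""
    atoms = 2 * m * (2 * n + 1)       # N
    bonds = 6 * m * n + m - n         # B
    return energy_per_atom * atoms / (2.0 * bonds)


# The (3, 5) structure: the energy per atom 1.4731 gives the bond order 0.5524.
print(round(mean_bond_order(1.4731, 3, 5), 4))  # 0.5524
```

The further step from bond order to bond length requires the curve of order against length, which is not reproduced here and is therefore left out of the sketch.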


Table II. Values of ℰ/2Bβ (Mean Mobile Order of Carbon Bonds)

  m \ n     1        2        3        4        5        6        7        8        15       19       ∞
    1     0.6667   0.6220   0.6036   0.5936   0.5875   0.5832   0.5800   0.5778   0.5701   0.5682   0.5611
    2     0.6301   0.5884   0.5726   0.5644     -      0.5562   0.5538     -        -        -      0.5389
    3     0.6194   0.5789   0.5642     -      0.5524   0.5493   0.5472     -        -        -      0.5338
    4     0.6141     -        -      0.5532     -      0.5463     -        -        -        -        -
    5     0.6111     -      0.5584     -        -        -        -        -        -        -        -
    6     0.6090   0.5701     -        -        -        -        -        -        -        -        -
    7     0.6076     -        -        -        -        -        -        -        -        -        -
    8     0.6065     -      0.5553     -        -        -      0.5401     -      0.5336     -        -
    ∞     0.5993   0.5620   0.5512   0.5449   0.5408     -        -        -        -        -      0.5254



Table III. Mean Carbon-Carbon Bond Lengths (A.)

  m \ n     1       2       3       4       5       6       7       8       15      19      ∞
    1     1.389   1.397   1.400   1.402   1.403   1.404   1.405   1.405   1.407   1.408   1.413
    2     1.396   1.403   1.406   1.408     -     1.409   1.410     -       -       -     1.415
    3     1.399   1.404   1.408     -     1.411   1.411   1.412     -       -       -     1.415
    4     1.399     -       -     1.410     -     1.412     -       -       -       -       -
    5     1.399     -     1.409     -       -       -       -       -       -       -       -
    6     1.399   1.406     -       -       -       -       -       -       -       -       -
    7     1.400     -       -       -       -       -       -       -       -       -       -
    8     1.400     -     1.410     -       -       -     1.413     -     1.415     -       -
    ∞     1.402   1.409   1.411   1.412   1.413     -       -       -       -       -     1.416


4. Deductions from Results of Tables I—III 

We may make several deductions from the values shown in Tables I-III. 

(a) Table I shows that an increase in size of crystallite always leads to a greater binding 
energy per atom. This is found along every column and every row of the table. We shall 
shortly compare this with the results of Taylor, Eyring and Sherman, who found that 5 sodium 
atoms were less stable than 4. Our table shows that after the first few atoms are put together 
ℰ/Nβ changes only very slowly from structure to structure, gradually increasing with the size
of the molecule. This increase, to be sure, has only been established for a change from one of
our regular symmetrical structures to the next one in either direction: and at first, with small
molecules, we may expect an irregular variation of ℰ/Nβ when we add 1 or 2 further atoms.
Such erratic behaviour, however, would not be anticipated when the structures are larger, and
we may suppose that after a certain minimum size ℰ/Nβ becomes a smooth function of N.




Combining this conclusion with the work of Taylor, Eyring and Sherman, it follows that it is 
not until a crystallite has built up to at least 20 or 30 atoms that it possesses the power of 
attracting further atoms with increasing facility. This is in keeping with a good deal of 
evidence summarised by Taylor, Eyring and Sherman, and fits in with the minimum number 
of about 25 atoms for crystallisation nuclei. 

These calculations, we must emphasise, refer to the resonance energy of the π-electrons,
but they would probably be valid for other types of binding. However, we must exercise a
little caution in extending them directly. For in our case only one half of the allowed levels
of the molecular π-electrons are filled: in terms of the familiar theory of metals, the energy
band from the π-electrons is only half filled, since it will accommodate two electrons per atom,
and our systems provide only one. If we had completely filled the band, our approximation
would have given a total energy equal to $NE_0$, and resonance would have played no part in the
total binding. Thus what we have shown is that with an incompletely filled, or half-filled, 
band, the larger the layer the more bonding it becomes. 

(b) If we consider the two homologous series represented by m = 1 (polyacenes) and n = 1
(polyphenyls), we see from Tables I and II that when we compare systems with an equal 
number of hexagons, as given by the formula in (1), then the resonance energy per atom is 
slightly greater for the polyacene series than for the polyphenyl series, but that the energy per 
bond is distinctly less. However, when we consider, in section 5, the effect of introducing 
the overlap integral, previously neglected, we find that both the energy per bond and the energy 
per atom are greater for the polyphenyl series. It seems most likely that this latter calculation 
is the more accurate. This greater stability of the polyphenyl systems may be attributed to 
the larger number of unexcited Kekule structures which are possible. 

(c) Table III shows that although there is a distinct increase in the length of the carbon- 
carbon bond as the size of the crystallite increases, this increase is not large. By the time that 
as many as 10 hexagons have been built together, the mean bond length differs from that of 
the infinite layer by not more than 0.01 A. This is in general agreement with the experimental
results of Taylor (1941). J. H. de Boer (1940) has deduced a similar increase as the crystallite
increases in size, but his argument, which differs from ours, predicts considerably larger bond
length changes than now seem to occur. De Boer’s argument appears not to make an adequate
distinction between σ- and π-type electrons on the boundary atoms.

(d) Table II shows that increasing the size of the crystallite in either direction reduces the
order of the bond, so that, as Table III shows, the carbon-carbon distance is actually increased.
This is, at first, a rather surprising condition, for we should have expected greater resonance to
be associated with shorter bonds. It is due to the fact that the resonance energy per atom
depends on ℰ/N, whereas the mean bond order depends on ℰ/B. As (1) shows, N and B vary
differently in terms of m and n. 

(e) Table I allows us to discuss the effect of resonance energy on the shape; and in particular 
to answer the question: if the crystallite layer contains N carbon atoms, will the resonance, or 
mobile, character of the π-electrons favour a long, thin, filamentous structure or a more nearly
square one? It is most important to recognise that there are other factors as well as the
resonance energy which help to determine the shape. Thus the availability of other atoms
(H, N, O) to saturate the edge trigonal valencies, the relative strengths and numbers of these
bonds in relation to the basic single C—C bonds, are factors that need to be considered. 
We shall make some rather tentative suggestions regarding their effects later in this paper. 
But it is desirable to distinguish these various factors and to consider separately their influence 
on the shape. 

Now if N is constant, the permitted values of m and n are such as to lie on approximately 
parallel curves running from bottom left to top right of Table I. It is evident that the resonance 
energy per atom is greatest somewhere near the middle of such lines. We conclude that 
resonance, considered as an effect by itself, prefers an approximately square, or circular 
structure rather than a long strip either of polyphenyl or polyacene type. 

We may make this discussion of the effect of resonance on shape more explicit by comparing
together three different structures for all of which N = 66. If we refer to the structure in terms
of (m, n), these are (11, 1), (1, 16) and (3, 5). The first of these is a polyphenyl chain of 11
hexagons, the second is a condensed polyacene with 16 hexagons side by side, and the third is
very nearly square. The resonance energies per atom are now strictly comparable. They
are shown below.

                                Polyphenyl     Polyacene      Square
                                 (11, 1)        (1, 16)       (3, 5)
  Resonance energy per atom      0.392 β        0.398 β       0.473 β

This makes it quite clear that resonance itself gives a strong tendency to favour a square rather
than an oblong configuration. If β is taken to be about 20 KCals, the energy difference above
is of the order of 1.5 KCals per atom. This is significantly large.
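
The arithmetic of this comparison may be set out explicitly. The sketch below simply takes the three resonance energies quoted above and the rough value of 20 KCals for β; the names are hypothetical. It confirms that each shape contains 66 atoms and converts the differences into KCals per atom.

```python
BETA_KCAL = 20.0  # rough value of beta in KCals, as assumed in the text

# (m, n) and resonance energy per atom in units of beta, copied from the comparison above.
SHAPES = {
    "polyphenyl": ((11, 1), 0.392),
    "polyacene": ((1, 16), 0.398),
    "square": ((3, 5), 0.473),
}


def atom_count(m, n):
    """N = 2m(2n + 1), equation (1)."""
    return 2 * m * (2 * n + 1)


def advantage_over(reference, other, beta=BETA_KCAL):
    """Extra resonance energy per atom of `reference` over `other`, in KCals."""
    return (SHAPES[reference][1] - SHAPES[other][1]) * beta


for name, ((m, n), _) in SHAPES.items():
    print(name, atom_count(m, n))                         # 66 in each case
print(round(advantage_over("square", "polyacene"), 2))    # 1.5
print(round(advantage_over("square", "polyphenyl"), 2))   # 1.62
```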


[Fig. 2. Perspective diagram of ℰ/Nβ as a function of m and n.]



This dependence of ℰ/Nβ upon the values of m and n may be exhibited graphically by
regarding ℰ/Nβ as a function of m and n, and plotting a surface in which the base plane contains
the m and n axes, and in which the ordinate is the value of ℰ/Nβ. This ordinate is, of course,
strictly determined only for integral m and n, but as the perspective diagram of fig. 2 shows,
the calculated points lie on a smooth surface. In fig. 2 the height of the basal plane corresponds
to ℰ/Nβ = 1.3. The thin lines are lines of constant m and n along the surface, and
the line of greatest slope (i.e. the line joining points of greatest resonance energy per atom for a
given total number of atoms N) is shown along the diagonal. The crosses mark the calculated
points. 

(f) The variations in ℰ/Nβ shown in Table I imply that the heats of combustion of various
shapes and sizes of crystallite are not by any means identical. If allowance is not made for
this, significant errors may be made. For example, if we consider the (3, 5) structure, for
which ℰ/Nβ = 1.4731, and compare it with the infinite layer for which ℰ/Nβ = 1.5761, or the
(8, 15) structure where ℰ/Nβ = 1.5342, and if we take β = 20 KCals approximately, the corresponding
differences in heat of combustion or sublimation amount to about 2.1 and 1.2 KCals
per mole respectively. It is interesting to note here that the crystal layers found by Riley
(1939) actually have sizes intermediate between those of the (3, 5) and (8, 15) structures.
Such differences in energy are within the accuracy of modern precision calorimetry. If the
heat of sublimation of graphite is taken to be 124.1 KCals per mole (Herzberg, 1937) they
amount to between 1 and 2 per cent.
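
These figures follow directly from the entries of Table I. A short sketch of the arithmetic, assuming β = 20 KCals and the heat of sublimation of 124.1 KCals per mole quoted above (the function names are hypothetical):

```python
HEAT_OF_SUBLIMATION = 124.1  # KCals per mole, the figure quoted from Herzberg (1937)


def energy_difference(e_small, e_large, beta=20.0):
    """Difference in mobile binding energy per atom, in KCals, for two values of E/(N beta)."""
    return (e_large - e_small) * beta


def as_percentage(difference, total=HEAT_OF_SUBLIMATION):
    """The difference expressed as a percentage of the heat of sublimation."""
    return 100.0 * difference / total


for label, e_large in (("infinite layer", 1.5761), ("(8, 15) structure", 1.5342)):
    d = energy_difference(1.4731, e_large)      # relative to the (3, 5) structure
    print(label, round(d, 2), round(as_percentage(d), 2))
```

The two differences come out near 2.1 and 1.2 KCals, that is roughly 1.7 and 1.0 per cent. of the heat of sublimation.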

Now it happens that there are some extremely careful experiments of Dewey and Harper 
(1938) which provide some indication of the validity of our work. These writers prepared 
anthracite cokes at temperatures between 900 and 1300° C., and they found a steadily diminishing
hydrogen content. Riley (1939) would have expressed this by saying that the cokes
prepared at the higher temperatures had a larger condensed aromatic structure than those at
the lower temperature. According to our present account this should imply a larger resonance
energy, and hence a smaller heat of combustion. Dewey and Harper measured the heat of
combustion as a function of the percentage of hydrogen and found a smooth variation. In
particular, the difference in heat of combustion between cokes prepared at 900° and 1300° was
no less than 2.4 KCals per mole. Assuming that the hydrogen had the same heat of combustion
as ordinary gaseous hydrogen, they calculate that this difference would have been expected to
be 3.0 KCals per mole. The remaining 0.6 KCals per mole, which is considerably greater than
the experimental error, must lie in a difference in energy between the two cokes. This will 
itself arise partly from the increased size, so that we may expect differences of this order of 
magnitude in the resonance energy of the two sizes of layer. In the case of graphite layers, 
a complete numerical correlation cannot be expected till we have a fuller understanding of the 
proportion and location of the small amounts of inorganic matter present in the cokes, and of 
the nature of the intercrystalline boundaries. But it is satisfactory that the order of magnitude 
of energy difference that we predict theoretically agrees so nicely with that actually obtained. 

Another rough comparison of order of magnitude may be made with some equally accurate 
experiments of Jessup (1938), who measured the heat of combustion of diamond and found 
variations of 0.12 KCals per mole between diamonds of size 2.5 μ and 39.5 μ. Only one
quarter of this could be attributed to free valencies at the edge of the crystallite; as there are 
no foreign atoms present, the rest must presumably be largely due to differences in resonance 
energy on account of the difference in size. And, in fact, the direction of the change is in 
accord with this interpretation. Strict comparison is not possible because diamond is a three- 
dimensional crystal and we have dealt only with two-dimensional ones. Also, the sizes of 
Jessup’s crystallites were rather larger than those for which we have made our calculations. 
But the importance of the comparison is that it shows again that our order of magnitude for 
change of resonance energy with size is indeed correct. Further work on this and other 
crystallites may enable a more precise test of the absolute magnitude to be made. 

(g) The considerations above lead to the further question: how large must a layer of 
carbon atoms be before it may be regarded as graphite and not as a large molecule approximating
to graphite? The answer to this question, which is given from Tables I to III, seems
to be that we require something like 50 carbon atoms (the (3, 5) structure has N = 66), and these 
must form a condensed system with at least two, and preferably three, hexagons in each 
direction. Such a structure would have an area of not less than 100 square Angstroms. 

5. Inclusion of the Overlap Integral 
In this section we reconsider one of the approximations that was previously made in section 3. 
We assumed in (5) that the overlap integral S between the atomic orbitals of two adjacent 
carbon atoms was negligible. Actually its value is about 0.25. In view of this it behoves us 
to examine the validity of arbitrarily neglecting it. The matter has been discussed for simpler 
molecules by Wheland (1942), who showed that no appreciable error was involved in this 
simplification. We shall adopt a somewhat similar procedure here. 

We are to include the overlap between neighbouring orbitals, giving all such integrals the 
same value S. If we include S, the fundamental secular equations (6) become 

$$c_{23}(E_0 - E) + (c_{13} + c_{22} + c_{33})(\beta - ES) = 0,$$

$$c_{13}(E_0 - E) + (c_{14} + c_{23})(\beta - ES) = 0. \tag{13}$$

Let us put E_0 − E = ε, as in (7), and also 

$$y = (E_0 - E)/(\beta - ES). \tag{14}$$





Then the fundamental equations (13) become exactly the same as (8) except that the variable 
is now y and not x. Our previous analysis may be applied just as before, but the roots of the 
secular determinant give us (E_0 − E)/(β − ES) and not (E_0 − E)/β. We may, however, transform 
(14) by putting 

$$\beta - E_0 S = \gamma. \tag{15}$$

Then 

$$E_0 - E = \epsilon = \frac{\gamma y}{1 - Sy}. \tag{16}$$


According to (16), the binding energy of each orbital is found in terms of the new integral γ 
which replaces the former β. It is a simple matter to convert all the previous energy levels 
into the revised form, and then to determine the total mobile energy ℰ = Σε. From this we 
could proceed, just as in sections 3, 4, to draw up tables of ℰ/Nγ and ℰ/2Bγ. But we shall 
only reproduce a table of ℰ/Nγ, as our object is merely to confirm that the inclusion of S makes 
no more difference to the order of energies than with the simpler molecules discussed by 
Wheland. 

It may perhaps be mentioned that the inclusion of S introduces a considerable asymmetry 
into the energy distribution (as is shown for infinite strips in paper II), although it will 
be seen from (16) that previously bonding orbitals remain bonding, and anti-bonding 
ones remain anti-bonding. We hope to discuss this asymmetry in another place. In our 
notation γ is a negative constant, so that we obtain bonding orbitals by taking all the negative 
values of y, i.e. all the negative roots of the former secular determinant. It will also be 
recognised that apart from a normalising constant, the coefficients c_rs in the final molecular 
orbitals are unaffected by the inclusion of S. 
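
As a rough illustration of equation (16), the sketch below converts a set of roots y into overlap-corrected binding energies and sums them over the doubly filled bonding orbitals. The benzene roots (−2, −1, −1, 1, 1, 2) are the standard values for a single hexagon and are supplied here as an assumption, as are the function names; the ratio is reported against the magnitude of γ, which appears to be the convention of Table IV.

```python
def overlap_corrected_energy(y, gamma=-40.0, S=0.25):
    """Binding energy of one orbital, eps = gamma*y / (1 - S*y), as in equation (16)."""
    return gamma * y / (1.0 - S * y)


def mobile_energy_per_atom(roots, gamma=-40.0, S=0.25):
    """Total mobile energy divided by N*|gamma|, with every bonding orbital doubly filled.

    One root is assumed per carbon atom, so N = len(roots).
    """
    bonding = [y for y in roots if y < 0]  # gamma < 0, so negative y gives positive binding
    total = 2.0 * sum(overlap_corrected_energy(y, gamma, S) for y in bonding)
    return total / (len(roots) * abs(gamma))


# Hypothetical input: the six roots of a single hexagon (the (1, 1) layer).
BENZENE_ROOTS = [-2.0, -1.0, -1.0, 1.0, 1.0, 2.0]
benzene_ratio = mobile_energy_per_atom(BENZENE_ROOTS)
print(round(benzene_ratio, 4))
```

Under these assumptions the single hexagon gives a value that can be compared with the first entry (m = 1, n = 1) of Table IV. The same function also shows the asymmetry mentioned above: the roots y = +1 and y = −1 no longer give energies of equal magnitude.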

Our calculations have been carried through, where definite numerical values are required, 
under the hypothesis that S = 1/4 and γ = −40 KCals. Wheland (1942) suggests γ = −38 
KCals, and Mulliken, Rieke and Brown take a somewhat larger value. A really strict theory 
would need to take account of the individual variations of S and γ for the links of different 
length. But such a procedure is not justified for our present purposes, and it would certainly 
not change the nature of our conclusions. These are: 

(a) An increase of either m or n increases the binding energy per atom, so that the 
perspective diagram of ℰ/Nγ closely resembles that shown in fig. 2 for ℰ/Nβ. Table IV 
shows the actual values of ℰ/Nγ. 


Table IV. Values of ℰ/Nγ. Binding Energy per Atom with Account Taken of the Overlap S 

 m \ n      1        2        3        4        5        6        7        8       15       19       ∞
   1     0.9778   0.9864   0.9861   0.9856   0.9849   0.9843   0.9839   0.9836   0.9825   0.9822   0.9812
   2     0.9876   1.0027   1.0083   1.0113   1.0269   1.0159   1.0171     ..       ..       ..     1.0254
   3     0.9912   1.0093   1.0177     ..     1.0289   1.0312     ..       ..       ..       ..     1.0434
   4     0.9930     ..       ..     1.0299     ..     1.0367     ..       ..       ..       ..       ..
   5     0.9941     ..     1.0267     ..       ..       ..       ..       ..       ..       ..       ..
   6     0.9948   1.0165     ..       ..       ..       ..       ..       ..       ..       ..       ..
   7     0.9953     ..       ..       ..       ..       ..       ..       ..     1.0589     ..       ..
   8     0.9955     ..     1.0321     ..       ..       ..     1.0507     ..       ..       ..       ..
   ∞     0.9984   1.0242   1.0430     ..       ..       ..       ..       ..       ..       ..       ..










(b) There is only one exception to the rule (a), namely the polyacene series with m = 1. 
Here, for n greater than 2, ℰ/Nγ actually decreases very slightly with increase of n. This 
predicted decrease, which is less than ½ per cent., makes no difference to our final conclusion 
(c), though it does leave undecided the relative stabilities per atom (not per bond) of the 
π electrons in the polyacene and polyphenyl series. 

(c) The greatest stability for a given total number N of atoms occurs in the approximately 
square configuration. We give, below, the resonance energies per atom for the three structures 
mentioned in section 4, and for which N = 66. 



                               Polyphenyl    Polyacene    Square
                               (11, 1)       (1, 16)      (3, 5)
Resonance energy per atom      0.196γ        0.183γ       0.227γ


The preference for a square shape is again shown clearly; in fact the (3, 5) structure is the 
most stable of the three by about 14 per cent. In the earlier approximation this figure was 
17 per cent. Numerically these two extra stabilities amount to 1.24 and 1.50 KCals respectively. 

All this shows that the effect of introducing the overlap S is not to make any really significant 
change in our earlier conclusions. 
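
The percentages quoted above can be retraced with a few lines of arithmetic. The sketch below takes the resonance energies per atom from the small table (in units of γ) and, for the earlier approximation, forms them from the ℰ/Nβ values 1.392 and 1.473 quoted in section 6 by subtracting a reference of 1 per atom for isolated double bonds; that reference, the variable names and the choice of |γ| = 40 KCals are assumptions made for illustration.

```python
# Resonance energies per atom with overlap, in units of gamma (from the table above).
with_overlap = {"polyphenyl": 0.196, "polyacene": 0.183, "square": 0.227}

# Without overlap: E/(N*beta) minus an assumed reference of 1 per atom.
without_overlap = {"polyphenyl": 1.392 - 1.0, "square": 1.473 - 1.0}

GAMMA = 40.0  # KCals, magnitude only


def extra_stability_percent(values):
    """Extra stability of the square over the polyphenyl, as a percentage of the square's value."""
    return 100.0 * (values["square"] - values["polyphenyl"]) / values["square"]


percent_with = extra_stability_percent(with_overlap)
percent_without = extra_stability_percent(without_overlap)
kcal_with = (with_overlap["square"] - with_overlap["polyphenyl"]) * GAMMA

print(round(percent_with), round(percent_without), round(kcal_with, 2))
```

Both percentages are taken relative to the square structure, which is the reading that reproduces the figures of about 14 and 17 per cent.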


6. Stability with Respect to Hydrogen 

We have already discussed, in section 4 (e), the effect of resonance energy on the shape of a 
crystallite layer, showing that this makes for a square rather than an oblong crystallite. But, 
as we then emphasised, resonance energy is not the only factor influencing this shape. The 
total energy of a crystallite comprises three terms: 

(a) The energy of the underlying carbon structure of single carbon-carbon bonds. 

(b) The energy of the mobile π-electrons (effectively, the resonance energy which we have 
already considered). 

(c) The boundary energy (cf. section 4 (e)). 

We have still to include the contributions from (a) and (c). 

The energy contribution (c) arises from the necessity of our saturating the free valencies of 
those carbon atoms, on the boundaries of the crystallite, which have only two carbon neighbours. 
The energy contribution of a free valency of this kind has never been calculated, and it is not 
our present purpose to do so. We shall rather assume that the whole carbon crystallite is 
edged with hydrogen atoms. We thus confine our attention to pure hydrocarbons. This 
enables us most simply to illustrate the effect of the energy contribution (c). We shall further, 
for definiteness, assume that the hydrogen atoms round the boundaries of the crystallite are 
drawn from a sufficient supply of diatomic hydrogen. Then, since there are 6nm + m − n 
C–C σ-bonds, and 4m + 2n C–H σ-bonds, the combined effect of the energies (a) and (c) is 
given by 

$$(6nm + m - n)\,\mathcal{E}_{\mathrm{C-C}} + (4m + 2n)\left(\mathcal{E}_{\mathrm{C-H}} - \tfrac{1}{2}\mathcal{E}_{\mathrm{H-H}}\right),$$

where $\mathcal{E}_{\mathrm{C-C}}$ denotes the energy of a C–C single (σ)-bond, etc. Now, for a given number of 
carbon atoms, it is easily shown that this part of the energy is smallest for a square crystallite, 
provided that $2\mathcal{E}_{\mathrm{C-H}} > \mathcal{E}_{\mathrm{C-C}} + \mathcal{E}_{\mathrm{H-H}}$, which is almost certainly true and will, indeed, be the case 
for almost all conceivable edge atoms. In other words, the energy contributions (a) and (c) 
act in opposition to the energy contribution (b) in that while resonance energy makes for square, 
disc-like crystallites, the energy contributions (a) and (c) together favour long, oblong structures. 
In fact we shall find that, in our particular field of pure hydrocarbons, the effect of the energies 
(a) and (c) is as important as the resonance energy (b). 
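
The bond counts used here are easy to check mechanically. The following sketch (function names are illustrative) counts atoms and bonds for an (m, n) layer, taking N = 2m(2n + 1) for the number of carbon atoms, and evaluates the combined contribution of (a) and (c) with the bond energies of 58.6, 87.3 and 103.4 KCals adopted from Pauling in the text.

```python
def layer_counts(m, n):
    """Return (carbon atoms, C-C sigma bonds, C-H sigma bonds) for an (m, n) layer."""
    atoms = 2 * m * (2 * n + 1)
    cc_bonds = 6 * n * m + m - n
    ch_bonds = 4 * m + 2 * n
    return atoms, cc_bonds, ch_bonds


def sigma_and_boundary_energy(m, n, e_cc=58.6, e_ch=87.3, e_hh=103.4):
    """Combined energies (a) and (c) in KCals, with edge hydrogen drawn from H2."""
    _, cc_bonds, ch_bonds = layer_counts(m, n)
    return cc_bonds * e_cc + ch_bonds * (e_ch - 0.5 * e_hh)


for m, n in [(11, 1), (1, 16), (3, 5)]:
    print((m, n), layer_counts(m, n), round(sigma_and_boundary_energy(m, n), 1))
```

For the three 66-atom structures the square (3, 5) gives the smallest value and the polyphenyl (11, 1) the largest, which is the opposition to the resonance energy described above.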

The three structures, viz. (11, 1), (1, 16) and (3, 5), whose resonance energies we have 
already compared in section 4 (e), provide a good illustration. Following Pauling (1939), we 
shall take the energies $\mathcal{E}_{\mathrm{C-C}}$, $\mathcal{E}_{\mathrm{C-H}}$ and $\mathcal{E}_{\mathrm{H-H}}$ as 58.6, 87.3 and 103.4 KCals respectively. As is 
usual in these problems, any other self-consistent set of alternatives to these would give 
equivalent results. Then, with β = 19.5 KCals (Wheland, 1934): 

(i) Polyphenyl (11, 1)      C–C bonds,     76 × 58.6           =   4453.6
                            C–H bonds,     46 × 87.3           =   4015.8
                            π-electrons,   66 × 19.5 × 1.392   =   1791.5

                                                                  10260.9

(ii) Polyacene (1, 16)      C–C bonds,     81 × 58.6           =   4746.6
                            C–H bonds,     36 × 87.3           =   3142.8
                            π-electrons,   66 × 19.5 × 1.398   =   1799.2

                                                                   9688.6
                            5 H2           = 5 × 103.4         =    517.0

                                                                  10205.6

(iii) Square (3, 5)         C–C bonds,     88 × 58.6           =   5156.8
                            C–H bonds,     22 × 87.3           =   1920.6
                            π-electrons,   66 × 19.5 × 1.473   =   1895.8

                                                                   8973.2
                            12 H2          = 12 × 103.4        =   1240.8

                                                                  10214.0


This shows that the “square” is now less stable than the polyphenyl. This conclusion is 
reinforced if we take account of the overlap integral (section 5). Taking γ = 38 KCals, the 
three energies for comparison become 10967, 10871 and 10894 KCals respectively. This is 
satisfactory in that the polyphenyl energy is now sufficiently greater than that of the other two 
for us to be confident that the relative order will not be affected by the effects of steric hindrance 
between the boundary hydrogens in the former case, and in agreement with the experimental 
experience that polyphenyls are more readily prepared than polyacenes. We must stress, 
however, that the calculated energies are relative energies only; we have not attempted to give 
a numerical value to the constant energy contribution 66 E_0. Further, although the figures 
shown relate to a particular choice of the resonance integrals β and γ, small alterations in the 
values of these two quantities would not affect the conclusions that we have just made. 
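
The three totals without overlap can be retraced as follows. This is a minimal sketch of the bookkeeping only: it assumes, as the tabulated sums suggest, that each structure is compared on the basis of the same 46 hydrogen atoms, any hydrogen not bound to carbon being counted as H2; the dictionary and function names are illustrative.

```python
E_CC, E_CH, E_HH = 58.6, 87.3, 103.4  # KCals, the bond energies adopted in the text
BETA = 19.5                           # KCals, magnitude of the resonance integral
N_ATOMS = 66
H_POOL = 46                           # assumption: hydrogen atoms available to every structure

# (C-C bonds, C-H bonds, mobile energy per atom in units of beta)
STRUCTURES = {
    "polyphenyl (11, 1)": (76, 46, 1.392),
    "polyacene (1, 16)": (81, 36, 1.398),
    "square (3, 5)": (88, 22, 1.473),
}


def total_energy(cc_bonds, ch_bonds, mobile_per_atom):
    """Sigma bonds + mobile electrons + hydrogen left over as H2, in KCals."""
    spare_h2 = (H_POOL - ch_bonds) / 2
    return (cc_bonds * E_CC + ch_bonds * E_CH
            + N_ATOMS * BETA * mobile_per_atom + spare_h2 * E_HH)


totals = {name: total_energy(*spec) for name, spec in STRUCTURES.items()}
for name, value in totals.items():
    print(name, round(value, 1))
```

The ordering that results, polyphenyl above square above polyacene, is the one discussed in the paragraph above.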

There remains, however, the wider question of whether, on the basis of energy 
considerations alone, we should expect small units to join together to form larger ones. We have made 
a number of calculations in this field, but the results are too inconclusive to be given in any 
great detail. Two examples must suffice. 

(i) The energies of two polyphenyls m = 4, n = 1 amount (without overlap) to 7600 KCals, 
while that of the single polyphenyl m = 8, n = 1, together with a hydrogen molecule, is 7596 
KCals; suggesting that polymerisation will not take place. Including the overlap integral, 
the two energies for comparison become 8118 and 8110 KCals respectively. 

(ii) The energies of two polyacenes m = 1, n = 4 amount (without overlap) to 5529 KCals, while 
that of the polyacene m = 1, n = 8, together with the ethylene molecule C2H4 which we can 
form from the unused carbon and hydrogen atoms, amounts to 5522 KCals; again suggesting, 
though not conclusively, that polymerisation will not take place. Including the overlap 
integral (and taking γ = 38 KCals), the energies for comparison become 5905 and 5869 KCals, 
so that our former tentative conclusion receives stronger support. 

The results of many such calculations, involving more elaborate structures than those we 
have here considered, lead us to the following observations:— 

(a) The energies of the various sets of hydrocarbons which it is possible to form from a 
given number of hydrogen and carbon atoms are surprisingly close together, and to 
the accuracy of our present knowledge of β and γ it is barely possible to distinguish 
between them. 

(b) In no case examined do the energy considerations suggest that simple structures will 
combine to form large laminar crystallites. 

(c) The inclusion of the overlap integral strengthens, rather than weakens, the basis of 
observation (b) above. 

Therefore, tentatively, we conclude that on the basis of energy considerations alone no such 
building-up process will occur at low temperatures. 

But we must stress that our conclusion, if valid, applies strictly only at the absolute zero 
of temperature. We have not taken account of energy contributions from excited electronic 
states, nor of the energies and entropies associated with the vibrations, etc. of the hydrocarbon 
structures. Particularly in the case of the zero-point energy this may be important in deciding 
the relative stabilities of small and large crystallites. On the other hand, experimental support 
may be lent to our conclusion from the observation (Blayden, Gibson and Riley, 1943) that it is 
only at relatively high temperatures that hydrogen is given off from coals, and larger graphitic 
crystallites begin to form. 

7. Conclusion 

The methods and techniques illustrated in this paper may be applied to other structures, 
in particular to three-dimensional crystallites as well as two-dimensional ones. The writers 
are considering the problem of a system of atoms in a cubic lattice, the atoms being supposed 
to be in s states. In the case of insulators, where the whole of the conduction band is filled, 
there are considerable divergences from symmetry in the density distribution of the levels. 

It is possible to treat other shapes of graphite layer in the same way. For example, we 
may consider the energies of layer planes having the shape of a lozenge, or parallelogram, 
with angle 60°. We have not, however, thought it worth while to report such calculations here, 
for they would only confirm the general conclusions already reached with the rectangular 
arrays. 

Taylor, Eyring and Sherman have stressed the fact that many surface films of atoms, 
containing one or two layers of adsorbed atoms, have markedly different properties from those 
associated with a three-dimensional structure. It should be possible to consider such systems 
along the lines of this present paper. 

In conclusion, our thanks are due to the British Iron and Steel Research Association for 
providing a calculating machine with which parts of these calculations were performed. 

Summary 

Calculations are made of the resonance energy, bond order and bond length in a series of 
graphitic layers of varying size. Carbon-carbon bond lengths appear to vary very little in size 
with increasing number of carbon atoms, in agreement with experiment. But variations in 
resonance energy are significant, and indicate clearly that resonance, by itself, favours an 
approximately square, rather than oblong, shape. But in the case of such layers in equilibrium 
in the presence of molecular hydrogen, the most stable layer containing a given number of 
carbon atoms is of the long, thin polyphenyl type. Some tentative calculations suggest that 
polymerisation of smaller groups to larger ones should be endothermic, in agreement with the 
experimental fact that the formation of larger graphitic crystallites during carbonisation occurs, 
with emission of hydrogen, only at high temperatures. 


REFERENCES TO LITERATURE 

Blayden, H. E., Gibson, J., and Riley, H. L., 1943. Proceedings of a Conference on the Ultra-fine 
Structure of Coals and Cokes, British Coal Utilisation Research Association. 

Coulson, C. A., 1939. “Bonds of Fractional Order”, Proc. Roy. Soc., A, CLXIX, 413. 

Coulson, C. A., 1941. “Quantum Theory of the Chemical Bond”, Proc. Roy. Soc. Edin., A, LXI, 115. 

Coulson, C. A., 1944. “Structure of Coronene”, Nature, CLIV, 797. 

Coulson, C. A., and Rushbrooke, G. S., 1940. “Note on the Method of Molecular Orbitals”, 
Proc. Camb. Phil. Soc., XXXVI, 193. 

De Boer, J. H., 1940. “Atomic Distances in Small Graphite Crystals”, Recueil des Travaux Chim. 
des Pays-Bas, LIX, 826. 

Dewey, P. H., and Harper, D. R., 1938. “Heats of Combustion of Anthracite Cokes”, Journ. Nat. 
Bur. Standards, XXI, 457. 

Herzberg, G., 1937. “Heat of Sublimation of Graphite”, Chem. Rev., XX, 145. 

Jessup, R. S., 1938. “Heats of Combustion of Diamond and Graphite”, Journ. Nat. Bur. Standards, 
XXI, 475. 

Lennard-Jones, J. E., and Coulson, C. A., 1939. “Structure and Energy of Hydrocarbon 
Molecules”, Trans. Faraday Soc., XXXV, 811. 

Mulliken, R. S., Rieke, C. A., and Brown, W. G., 1941. “Hyperconjugation”, Journ. Amer. 
Chem. Soc., LXIII, 41. 

Nelson, J. B., and Riley, D. P., 1945. “Thermal Expansion of Graphite”, Proc. Lond. Phys. 
Soc., LVII, 477. 

Pauling, L., 1939. “Nature of the Chemical Bond”, Cornell University Press. 

Riley, H. L., 1939. “Chemistry of Solid Carbon”, Soc. Chem. Ind., LVIII, 391. 

Rutherford, D. E., 1947. “Some Determinants Arising in Physics and Chemistry”, Proc. Roy. 
Soc. Edin., A, LXII, 229. 

Taylor, A., 1941. “Study of Carbon by Debye-Scherrer Method”, Journ. Sci. Instr., XVIII, 91. 

Taylor, H. S., Eyring, H., and Sherman, A., 1933. “Binding Energies in the Growth of Crystal 
Nuclei”, Journ. Chem. Phys., I, 68. 

Warren, B. E., 1941. “X-ray Diffraction in Random Layer Lattices”, Phys. Rev., LIX, 693. 

Wheland, G. W., 1934. “Quantum Mechanics of Unsaturated and Aromatic Molecules”, Journ. 
Chem. Phys., II, 474. 

Wheland, G. W., 1942. “Resonance Energies of Aromatic and Unsaturated Molecules”, Journ. Amer. 
Chem. Soc., LXIV, 902. 


(Issued separately September 22, 1948) 





XXXVI.—Graphite Crystals and Crystallites. II. Energies of Mobile Electrons in 
Crystallites Infinite in One Direction. By C. A. Coulson (Wheatstone Physics 
Laboratory, King’s College, London) and G. S. Rushbrooke (Chemistry Department, 
Leeds University). (With Three Text-figures.) 

(MS. received September 18, 1946. Read February 3, 1947) 

1. Introduction 

If, in the notation of paper I in this series (Bradburn, Coulson, and Rushbrooke, preceding 
paper; see especially fig. 1), the numbers m and n are both very large, then the corresponding 
crystallite becomes equivalent to a two-dimensional plane graphite layer of effectively infinite 
extent. We shall not, however, discuss this particular case at present, but confine our attention 
in this paper to those crystallites in which either m or n is large, but not both. Since by 
“large” we mean “effectively infinite”, we shall refer to such a crystallite as an infinite strip. 
The width of the strip will then be denoted by the parameter, m or n, which is still finite. 
Fig. 1 shows sections of some typical strips, and, in conjunction with fig. 1 of the preceding 
paper, will sufficiently explain the nomenclature. 



Fig. 1.—Some typical graphitic strips for which the energy bands have been calculated. 

It is not suggested that graphitic strips of this kind have any chemical interest or 
importance. It is, perhaps, rather unlikely that they would ever arise in practice. But they do have 
a certain theoretical interest, in that they provide a one-dimensional analogue of a metal 
(cf. Rijanow, 1934). As either m or n becomes very large, the possible energy levels for the 
mobile electrons become closer and closer together, eventually forming continuous “bands” 
of energy levels very much as in the well-known theory of ordinary three-dimensional metals. 
And we shall find that some infinite strips would behave as conductors and others as insulators. 
We end with a discussion of the extent to which such behaviour can be predicted from simple 
chemical bond diagrams. 

We have confined our attention entirely to determining the energies of the mobile electrons, 
and we shall not deal either with bond lengths or total binding energies. These are of rather 
secondary importance for such hypothetical structures. 









2. Energies of Mobile Electrons for Infinite Strips: without Overlap 

As in paper I we shall first ignore the influence of the overlap integral S, and later consider 
how the results so obtained have to be modified when this is included. In I the effect of this 
overlap integral was found to be relatively unimportant so far as binding energies were 
concerned: most of the effect of S could, in fact, be absorbed into the empirically determined 
resonance integral β. Although this still remains true of the binding energies of the infinite 
strips discussed here, the influence of the overlap integral on the shapes of the energy bands is 
pronounced, as we shall see in § 3. 

(a) n finite, m → ∞ 

When n is finite and m tends to infinity, the energy per carbon atom of the mobile electrons 
can be calculated at once from equation I (10) (equation (10) of paper I). For each of the 
n values of k, the 2m values of φ_k which make Δ = 0 are easily seen to be uniformly distributed 
over the range 0 < φ_k < π. Consequently, from equation I (9), the limiting value of ℰ/Nβ 
as m → ∞ is given by 

$$\frac{\mathcal{E}}{N\beta} = \frac{1}{2n+1} + \frac{2}{2n+1}\sum_{k=1}^{n}\frac{1}{\pi}\int_{0}^{\pi}\sqrt{1 + z_k^{2} + 2z_k\cos\phi}\;d\phi, \tag{1}$$

where 

$$z_k = 2\cos\frac{k\pi}{2(n+1)}.$$

The first term on the right-hand side of (1) derives from the m roots x = 1 of the equation 
Δ = 0 (remembering that N = 2m(2n + 1) and each occupied level is doubly filled). Each 
integral in (1) can be expressed straightforwardly as a complete elliptic integral, and its 
numerical value found from the tables. In this way limiting values of ℰ/Nβ for the infinite 
strips n = 1, 2, …, 5, m → ∞ have been computed, and are given in Table I. 
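
Equation (1) can also be evaluated by direct numerical quadrature instead of elliptic-integral tables. The sketch below is one such evaluation under stated assumptions: it uses a plain midpoint rule (not the authors' method), takes the integrand as √(1 + z_k² + 2z_k cos φ) with z_k = 2 cos(kπ/2(n + 1)), and uses illustrative function names.

```python
import math


def band_mean(z, steps=2000):
    """Midpoint-rule estimate of (1/pi) * integral over (0, pi) of sqrt(1 + z^2 + 2 z cos(phi))."""
    h = math.pi / steps
    total = 0.0
    for i in range(steps):
        phi = (i + 0.5) * h
        # max() guards against a tiny negative argument from rounding when z = 1
        total += math.sqrt(max(0.0, 1.0 + z * z + 2.0 * z * math.cos(phi)))
    return total * h / math.pi


def strip_energy_per_atom(n, steps=2000):
    """Limiting mobile energy per atom, in units of beta, for a strip of width n (m -> infinity)."""
    bands = sum(
        band_mean(2.0 * math.cos(k * math.pi / (2 * (n + 1))), steps)
        for k in range(1, n + 1)
    )
    return (1.0 + 2.0 * bands) / (2 * n + 1)


for n in range(1, 6):
    print(n, round(strip_energy_per_atom(n), 4))
```

The printed values can be set beside the first row of Table I; any difference in the last figure would reflect the quadrature used here rather than the tabulated elliptic integrals.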


Table I 

m → ∞       n =      1         2         3         4         5
ℰ/Nβ               1.3983    1.4612    1.4960    1.5136    1.5240
ℰ/Nγ               0.9984    1.0242    1.0430      ..        ..

n → ∞       m =      1         2         3      Skew Strip
ℰ/Nβ               1.4028    1.4826    1.5125    1.4373
ℰ/Nγ               0.9812    1.0254    1.0434    1.0144


Moreover, from (1) we can also derive the actual distribution of the individual energy 
levels. For let N(ε)dε denote the number of levels in the range ε to ε + dε. Then N(ε) is the 
sum of 

(i) m pairs of roots ε/β = ±1, 

(ii) n energy bands of density N_k(ε), k = 1, 2, …, n, 

each band k corresponding to the value of k in I (9). N_k(ε) is easily calculated if we remember 
that there are 2m levels uniformly distributed between φ_k = 0 and φ_k = π, and that 
x dx = −z_k sin φ_k dφ_k. In fact, 

$$N_k(\epsilon) = \pm\frac{x}{z_k\sin\phi_k}\,\frac{2m}{\pi\beta} = \pm\frac{4m\,\epsilon/\beta}{\pi\beta\sqrt{4z_k^{2} - \left(\epsilon^{2}/\beta^{2} - 1 - z_k^{2}\right)^{2}}}. \tag{2}$$



When we use the ± sign, here and later, it must be understood that the choice must always be 
such as to make N non-negative. 

Some of these energy bands are shown in fig. 2a. But before they can be plotted they must 
be “normalised”; details of the normalisation adopted are given below (§ 3). 


(b) m finite, n → ∞ 

Except in the particular case m = 1, there is no simple formula for the mean energy, or the 
density of the energy levels, for a strip of the type n → ∞. 

When m = 1, straightforward calculations give 

$$\frac{\epsilon}{\beta} = \pm\tfrac{1}{2}\left\{1 \pm \sqrt{9 + 8\cos\frac{k\pi}{n+1}}\right\}, \qquad k = 1, 2, \ldots, n,$$

so that the mean energy is given on evaluating a complete elliptic integral, as in case (a). So 
also, in this particular case, we can find the distribution of the roots analytically, obtaining 


where 


and 


N(e)= ± —o{M(jVi) + M(jk 2 )}, 

7 Tp 


MO) =y 






There is, however, no concentration of isolated roots in this case at ε/β = ±1. 

For m > 1 no such analysis is possible, and we must proceed numerically. The calculations 
are laborious, and we have dealt only with the two further cases m = 2 and m = 3. The 
continuous variable is now z or θ, where z = 2 cos θ and θ = kπ/2(n + 1). The method adopted 
was to find numerically the energies corresponding to evenly spaced values of θ over the 
range 0 ≤ θ ≤ π/2. We actually took θ = 0, π/16, π/8, …, π/2, and for each value of θ we 
located the values of φ for which Δ = 0. The corresponding energies were then found, without 
trouble, from I (9). These energies lie in 2m bands, and we have located nine points on each 
band. The total energy of the mobile electrons, and hence the mean energy per atom, can 
now be found by numerical integration over each band, using Simpson’s rule or an appropriate 
eight-strip formula. The results so obtained are given in Table I. We believe that they 
are correct to all the figures given. 
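
The integration step can be tried out on the one case, m = 1, for which the levels are known in closed form. The sketch below is an illustration under stated assumptions, not the authors' computation: it takes the two bonding levels at each θ to be ½{√(1 + 4z²) + 1} and ½{√(1 + 4z²) − 1} with z = 2 cos θ (equivalent to the usual polyacene formula), so that the mean energy per atom in units of β reduces to half the average of √(1 + 16 cos²θ) over 0 ≤ θ ≤ π/2. The function names and the default of 64 strips are hypothetical choices.

```python
import math


def simpson(f, a, b, strips):
    """Composite Simpson's rule on [a, b] with an even number of equal strips."""
    if strips % 2:
        raise ValueError("Simpson's rule needs an even number of strips")
    h = (b - a) / strips
    total = f(a) + f(b)
    for i in range(1, strips):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def polyacene_mean_energy(strips=64):
    """Mean mobile energy per atom (units of beta) for the strip m = 1, n -> infinity."""
    def integrand(theta):
        return math.sqrt(1.0 + 16.0 * math.cos(theta) ** 2)

    average = simpson(integrand, 0.0, math.pi / 2, strips) / (math.pi / 2)
    return 0.5 * average


print(round(polyacene_mean_energy(), 4))
print(round(polyacene_mean_energy(strips=8), 4))  # the nine-point grid described in the text
```

The first value can be compared with the entry for m = 1 in the lower half of Table I; the second line shows what the same rule gives on only nine points.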

Similarly the distribution, or density, of the energy levels can be found by numerical 
differentiation of each energy band. This was done graphically, and consequently the 
numerical results are not particularly accurate; but the energy bands shown in fig. 2b will 
be sufficiently reliable for comparison purposes. 


(c) A “Skew” Strip 

We have considered also one other such infinite strip, a “skew” strip of which the 
cross-section is shown in fig. 1. The general theory of the energy levels of such a strip is not included 
in the results of I, and is given in outline in the Appendix to the present paper. The mean 
energy of a mobile electron and the distribution of the energy levels are given by analytical 
formulae similar to those above. Numerically the results are shown in Table I and fig. 2b. 
The chief interest of this strip lies in a comparison of the mean energy of its mobile electrons 
with those of the strip m = 1, n → ∞. While the number of C–C bonds (and C–H bonds 
at the edges) is the same for the two strips, the resonance energy of the mobile electrons is 
greater for the skew array. Thus the skew array, i.e. the more “condensed” array, is the more 
stable. As a matter of fact, the increased stability of the skew type holds for smaller as well as 
larger systems: the simplest example is that phenanthrene, with three rings in a skew 
configuration, is about 14 KCals more stable than anthracene, with three rings in a straight row 
(Wheland, 1944, p. 69). 



Fig. 2a. Density distribution of energy levels for the infinite polyphenyl-type strips. For a 
description of the strips see fig. 1. The thick black ordinates denote concentrated levels. X(x) and 
Y(y) denote the densities without and with inclusion of overlap; in each case the normalisation is such 
that the total area under the curve, including the concentrated levels, is 2. 



Fig. 2b. Density distribution of energy levels for the infinite polyacene-type strips. 
(For further details see legend under fig. 2a.) 






3. The Effect of the Overlap Integral 

So far we have calculated the energy levels of the mobile electrons on the simplifying 
assumption that the overlap integral S can be neglected. This assumption seems always to 
have been made hitherto in the “tight binding theory” of metals (see Mott and Jones, 
1936). It remains to investigate the validity of such an assumption and, if necessary, to modify 
our results accordingly. We shall, in fact, find that a quite considerable modification of the 
energy bands is necessary when we include the effect of the overlap integral. 

We have already shown (I, § 5) that if x is a root of the determinantal equation I (10), then 
the binding energy of a mobile electron, associated with this root x, is given by 

$$\epsilon = y\gamma,$$

where 

$$y = \frac{x}{1 - xS}, \tag{3}$$

and γ is a constant given by 

$$\gamma = \beta - E_0 S.$$

In our previous sections we have neglected S and have therefore dealt with the approximation 
ε = xβ. Equations (3) show us how to “correct” our present energies when we do not ignore 
the overlap integral, which we shall again assume to have the value 0.25. 

We notice first that there is no longer symmetry about the energy ε = 0. While the roots x, 
being unchanged by the introduction of S, still occur in positive and negative pairs, this is no 
longer true of the corresponding values of y, and consequently no longer true of the energy 
levels ε. Now both β and γ are algebraically negative, and ε is positive for a binding state. 
Thus the binding states correspond to negative values of x (and therefore also of y), while 
positive values of x (and y) correspond to “anti-bonding” states. We shall refer to the 
half-band for which x (or y) is negative as the “lower” half of the band; then, when the system 
of mobile electrons is in its ground state, every level in the lower half-band is doubly occupied. 

We can now describe the convention adopted for the normalisation of the energy bands of 
figs. 2a, 2b. Working in terms of x or of y, we have densities which we can call X(x) and Y(y). 
The definition of X(x), for example, is that X(x)dx is the probability that the value of x lies 
between x and x + dx. The normalisation is such that when all the binding levels are filled 
with two electrons (having opposed spins) the total probability is 1. This means that for 
each half-band 

$$\int X(x)\,dx = \int Y(y)\,dy = 1. \tag{4}$$

In other words, if N is the number of carbon atoms in the strip, and N(ε) is the distribution 
function just determined, we plot the function obtained by letting n or m tend to infinity in 
the expression N(ε)/N. 

In the strips for which m tends to infinity there are concentrated levels at x = 1. In the 
normalisation scheme used above this gives a concentrated function 1/(2n + 1) in such cases. 
There is no concentrated level in the strips n → ∞. When the concentrated level exists it is 
included in the normalisation; for example, if n = 2, the curve of X(x) gives, for each half-band, 
an area 4/5 which, with the concentrated level, makes a total of unity. 

The normalisation of the X(x) curves now follows at once either from their explicit 
equations, e.g. (2), or, when they are derived numerically, on the basis of (4). The 
transformation from X(x) to Y(y) is surprisingly easily obtained by combining (3) and (4). It is 

$$Y(y) = (1 - xS)^{2}\,X(x).$$

In figs. 2a, 2b we show certain of these presumably more accurate energy bands for comparison 
with those calculated when the overlap integral is ignored. It is obvious at once that the 
upper (i.e. unoccupied) parts of the bands are very considerably changed, being now much 
wider than before, though even with the inclusion of the overlap integral our underlying 
assumptions in the molecular orbital method probably fail before we reach the very top of a 
band. The occupied half-bands, however, are comparatively unchanged in their main features, 
i.e. number of peaks and existence or otherwise of an energy gap between the occupied and 
unoccupied halves. Further, the ordinate ε = 0 still divides the total area below the X and Y 
curves into two equal parts, so that in each case an exactly half-filled band stretches up to this value. 
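
The change of variable from x to y, and its effect on a density, can be sketched numerically. In the example below the relation y = x/(1 − xS) is taken from (3), its inverse x = y/(1 + yS) and the Jacobian factor (1 − xS)² follow from it by elementary algebra, and the flat "toy" density on −1 ≤ x ≤ 0 is purely hypothetical, chosen only so that the preserved area is easy to check.

```python
def y_from_x(x, S=0.25):
    """Map a root x of the secular determinant to y = x / (1 - x*S)."""
    return x / (1.0 - x * S)


def x_from_y(y, S=0.25):
    """Inverse map, x = y / (1 + y*S)."""
    return y / (1.0 + y * S)


def transformed_density(X, y, S=0.25):
    """Density in y obtained from a density X in x: Y(y) = (1 - x*S)**2 * X(x)."""
    x = x_from_y(y, S)
    return X(x) * (1.0 - x * S) ** 2


def toy_density(x):
    """Hypothetical flat half-band: unit density for -1 <= x <= 0, zero elsewhere."""
    return 1.0 if -1.0 <= x <= 0.0 else 0.0


def area_under_Y(X, x_lo, x_hi, S=0.25, steps=20000):
    """Midpoint-rule area of the transformed density between the images of x_lo and x_hi."""
    y_lo, y_hi = y_from_x(x_lo, S), y_from_x(x_hi, S)
    h = (y_hi - y_lo) / steps
    return h * sum(transformed_density(X, y_lo + (i + 0.5) * h, S) for i in range(steps))


area = area_under_Y(toy_density, -1.0, 0.0)
print(round(area, 6), y_from_x(-2.0), y_from_x(2.0))
```

The area of the half-band is unchanged by the transformation, while a pair of roots such as x = ±2 is carried to very unequal values of y, which is the widening of the upper half-bands noted above.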

We must next ask whether the less severe distortions of the occupied half-bands significantly 
affect the mean binding energies of the mobile electrons. In those cases in which we have an 
explicit algebraic expression for the allowed values of x (e.g. equation (1)) we can of course 
use (3) to find the “corrected” mean energy. But this leads to a third order elliptic integral, 
and it has seemed more straightforward in every case to “correct” the energy levels at regularly 
spaced values of the continuous parameter φ or θ, and then perform a numerical integration, 
as in case (b) above, to find the mean energy. The results of such calculations are in Table I. 
We see that the mean value of y is in an almost constant ratio to the mean value of x; so that 
inclusion of the overlap integral has no effect on the relative stabilities of the structures. A 
similar conclusion, it will be remembered, was reached in I for large, but not infinitely large, 
systems; and, for still smaller systems, by Wheland (1942). 

An inspection of the density curves in fig. 2 shows that in all cases there are one or more 
infinite values of the ordinate, though of course the total area below the curve is necessarily 
finite. At first sight this may appear somewhat surprising; but there are three comments 
that we should like to make with regard to it. In the first place, a similar infinity occurs in 
the density curve for a body-centred cubic crystal (Mott and Jones, 1936, p. 85) and for a linear 
chain of atoms in s states (unpublished work). In the second place, any distortion or 
irregularity in the crystal, as will necessarily occur in real crystals, will destroy the infinity 
and replace it by a large peak. And finally, in the more accurate Wigner-Seitz calculations 
on face-centred iron by Greene and Manning (1943; see fig. 4 on p. 207), the density function 
shows four reasonably sharp peaks, two of which are extremely narrow. Indeed their curve 
quite closely resembles some of our curves in its general pattern. Evidently the existence of 
sharp maxima in the density curve is a not uncommon feature of periodic lattices. 


4. Division of the Strips into “Conductors” and “Insulators” 


The energy bands, the densities of whose levels are shown in fig. 2, divide into two distinct 
classes. In the case of strips for which m is finite and n tends to infinity (polyacene type) 
there is no energy gap between the occupied and unoccupied states; so we may refer to such 
strips as “conductors”. On the other hand, for the skew array and for most of the strips of 
polyphenyl type, when m tends to infinity, there is a finite energy gap between the occupied 
and the unoccupied levels of the mobile electrons. By analogy with three-dimensional metals, 
we must call such strips “insulators”. This gap decreases as m increases, so that the infinite 
graphite plane will be conducting in both directions in the plane. 

The exceptional strips of polyphenyl type occur when n = 2, 5, 8, . . ., and, rather unexpectedly, will behave as conductors. Equation (2) shows at once how strips of width n = 3p − 1 are exceptional, compared with strips of width 3p or 3p + 1. For (2) shows that the lower half of the band k extends from $\epsilon/\beta = -(1 + z_k)$ to $\epsilon/\beta = -(1 - z_k)$, where 

$$z_k = 2\cos\frac{k\pi}{2(n+1)},$$

and the allowed values of k are k = 1, 2, . . ., n. Thus the energy gap between the two halves of the band k is $-2\beta(1 - z_k)$, and it vanishes if, and only if, $z_k$ is allowed to have the value 1. This implies that n is of the form n = 3p − 1. 
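
The rule n = 3p − 1 can be checked by direct enumeration. The short Python sketch below is an illustration added here, not part of the original analysis. It assumes $z_k = 2\cos[k\pi/2(n+1)]$ with k = 1, . . ., n, and it measures the gap of band k in units of |β| as 2|1 − z_k|; the function name `min_gap` is hypothetical.

```python
import math

def min_gap(n):
    """Smallest gap 2*|1 - z_k| (in units of |beta|) over the bands k = 1..n."""
    gaps = []
    for k in range(1, n + 1):
        # Assumed form of z_k for a polyphenyl-type strip of width n.
        z_k = 2.0 * math.cos(k * math.pi / (2 * (n + 1)))
        gaps.append(2.0 * abs(1.0 - z_k))
    return min(gaps)

# Classify the strips of width n = 1..9; the tolerance absorbs rounding error.
for n in range(1, 10):
    kind = "conductor" if min_gap(n) < 1e-12 else "insulator"
    print(n, round(min_gap(n), 4), kind)
```

Under these assumptions the widths flagged as conductors are exactly those for which some $z_k$ equals 1.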

The conductive behaviour of these strips is unexpected in that it does not follow, in the way 
that the conductive or non-conductive properties of the other strips do, from simple considerations 
based on possible alternative canonical bond diagrams, involving double and single 
bonds. For all strips of the polyacene type we can write down alternative canonical structures 
resonance between which will provide long circuits for the mobile electrons, extending the 
whole length of the strips. Indeed it is possible to find a Kekule-type structure in which any 
chosen bond is double. It might therefore be surmised that resonance among all these 
structures would allow the migration of electrons over the whole system. Such a situation 
does not obtain, as trial will soon show, for any strip of the polyphenyl type (m → ∞), where 
there are certain bonds which always appear as single bonds in all Kekule structures that can 
be drawn. At first sight, then, it is a little surprising that certain of these strips, viz. those 



Graphite Crystals and Crystallites 357 

for which n = 2, 5, 8, . . ., should behave as conductors. It would take us too far afield to 
discuss this matter in detail; suffice it to indicate here an essential limitation of any argument 
based solely on the existence of alternative unexcited canonical structures. 

Summary 

The shapes of the energy bands have been determined for the mobile electrons in graphitic 
strips of infinite length but finite width, using as a basis the approximation of tight binding 
discussed for finite crystal layers in the previous paper. It appears that the effect of including 
overlap between the orbitals of adjacent atoms, whose incorporation in this type of calculation 
has hitherto been neglected, is to widen the top half of the band by a factor of the order of 
2 or 3, the lower half of the band not being greatly affected. Some of these strips may be 
classed as conductors, the others as insulators; but the distinction between the two may not 
be made on the basis of any simple chemical bond diagrams. 


REFERENCES TO LITERATURE 

GREENE, J. B., and MANNING, M. F., 1943. “Electronic Energy Bands in Face-centred Iron”, Phys. Rev., LXIII, 207. 

MOTT, N. F., and JONES, H., 1936. Theory of the Properties of Metals and Alloys, Oxford University Press. 

RJANOW, S., 1934. “Energy Bands in Thin Strips”, Zeits. f. Physik, LXXXIX, 806. 

RUTHERFORD, D. E., 1947. “Some Determinants Arising in Physics and Chemistry”, Proc. Roy. Soc. Edin., A, LXII, 229. 

WHELAND, G. W., 1942. “Resonance Energies in Unsaturated and Conjugated Molecules”, Journ. Amer. Chem. Soc., LXIV, 902. 

WHELAND, G. W., 1944. The Theory of Resonance, Wiley, p. 69. 


APPENDIX 

Roots of the Secular Equation for the Infinite Skew Strip 

Fig. 3 ( a ) shows a finite skew array. Its secular determinant, however, is very difficult 
to deal with in general terms owing to the different number of centres in successive rows of 
the array (in this case 6, 7, 7, 6). But the equations become manageable if we add two more 
terminal centres, as in fig. 3 (b), so that there are now the same number in every row. This 



Fig. 3. — (a) An infinite skew strip. 

(b) The skew strip, with numbering system, for which calculations were made. 

will appreciably affect the roots of the secular determinant for any finite array, but as the 
number of hexagons in the strip increases, the modification of the roots due to the two extra 
terminal centres will decrease in importance and eventually cease to have any significance whatever 
(Ledermann, Proc. Roy. Soc. London, A, clxxxii, 362, 1944). Thus the roots of the secular 
determinant for an infinite strip of type 3 (a) are indistinguishable from those of an infinite 
strip of type 3 (b), and to find the former we can develop the theory for strips of the latter type. 
This theory is somewhat similar to the analysis developed by Rutherford (loc. cit.) when 
evaluating the determinants which occurred in I. It is not, however, included in Rutherford’s 
work, and seems worth reporting separately. 





Numbering the centres as in fig. 3 (b) we obtain (cf. paper I) the secular determinant 



where I is the unit matrix having n rows and columns, and A and B are the matrices 



Here n is the number of centres in each row of 3 (b) and is necessarily odd. Rearranging, we 
obtain 



which factorises (Rutherford, loc . cit.) into 



Consequently to find the roots (x-values) of Δ = 0, we 

(i) write down the values of $z_k = 2\cos(k\pi/5)$, to find $z_1$ and $z_2$; 

(ii) for each $z_k$ solve $z_k \sin (n+1)\phi_{k,r} + \sin n\phi_{k,r} = 0$, k = 1, 2 and r = 1, 2, . . ., n, to give us the values of $\phi_{k,r}$ ($0 < \phi_{k,r} < \pi$); 

(iii) evaluate x from $x = \pm\sqrt{1 + z_k^2 + 2z_k\cos\phi_{k,r}}$. 




We thus find 2n pairs of roots. Of each corresponding pair of energy levels one is doubly 
filled and the other unoccupied. To find the mean energy of the mobile electrons for the 
infinite strip we therefore require 

$$\lim_{n\to\infty}\frac{1}{2n}\left\{\sum_{r=1}^{n}\sqrt{1+z_1^2+2z_1\cos\phi_{1,r}}+\sum_{r=1}^{n}\sqrt{1+z_2^2+2z_2\cos\phi_{2,r}}\right\}.$$

Since, as n → ∞, the values of $\phi_{k,r}$ become uniformly spaced over the range $0 < \phi_{k,r} < \pi$, we 
can write the limit as 

$$\bar{x}=\tfrac12\int_0^1\sqrt{1+z_1^2+2z_1\cos\pi x}\,dx+\tfrac12\int_0^1\sqrt{1+z_2^2+2z_2\cos\pi x}\,dx.$$

On simple transformation this leads to 

$$\bar{x}=\frac{2+\sqrt5}{\pi}\,E\!\left(\alpha,\frac{\pi}{2}\right),$$

where 

$$\sin^2\alpha = 4(\sqrt5-2),$$

and $E(\alpha, \pi/2)$ is a complete elliptic integral. 
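
The reduction to an elliptic integral can be checked numerically. The Python sketch below is an added illustration, not the authors' computation. It assumes $z_k = 2\cos(k\pi/5)$, a weight of one half on each of the two integrals, and the closed form $(2+\sqrt5)\,E/\pi$ with modulus $\sin^2\alpha = 4(\sqrt5-2)$; the helper names are hypothetical and the quadrature is a plain Simpson rule.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def mean_energy_numeric():
    """Half the sum of the two integrals over 0 <= x <= 1."""
    total = 0.0
    for k in (1, 2):
        z = 2.0 * math.cos(k * math.pi / 5.0)
        total += 0.5 * simpson(
            lambda x, z=z: math.sqrt(1 + z * z + 2 * z * math.cos(math.pi * x)),
            0.0, 1.0)
    return total

def mean_energy_closed():
    """(2 + sqrt 5)/pi times the complete elliptic integral of the second kind."""
    sin2_alpha = 4.0 * (math.sqrt(5.0) - 2.0)
    elliptic_e = simpson(
        lambda t: math.sqrt(1.0 - sin2_alpha * math.sin(t) ** 2),
        0.0, math.pi / 2.0)
    return (2.0 + math.sqrt(5.0)) / math.pi * elliptic_e

print(mean_energy_numeric(), mean_energy_closed())
```

The two values agree because $z_1 z_2 = 1$ and $z_1 + z_2 = \sqrt5$, so both integrals share the same modulus.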


(Issued separately September 22, 1948) 





XXXVII.— The Van der Waals Force between a Proton and a Hydrogen Atom. 

II. Excited States. By C. A. Coulson (King’s College, London) and Miss C. M. 

Gillam (University College, Dundee). (With One Text-figure.) 

(MS. received October 24, 1946. Read February 3, 1947) 

1. Introduction 

In a previous paper (Coulson, 1941, referred to hereafter as I) a calculation was made of the 
energy of interaction between a bare proton and a hydrogen atom in its ground state. In 
particular it was shown that if the separation of the two nuclei was R, this interaction energy, 
which gives rise to the Van der Waals force, could be expressed in the form 

$$E(R) = \frac{1}{R}E_1 + \frac{1}{R^2}E_2 + \dots \tag{1}$$

and the coefficients $E_1, E_2, \dots$ could be determined absolutely. The calculation was carried 
through as far as the term $1/R^{10}$. 

But this analysis only applied to the hydrogen atom in its ground state. And the problem 
has recently acquired a new importance in astrophysics; for one explanation of the broadening 
of certain lines in stellar spectra is that hydrogen atoms, which are very abundant in these 
atmospheres, are being perturbed, at the moment of emission or absorption, by the presence of 
neighbouring protons. It becomes desirable, therefore, to calculate the interaction energy 
between a proton and a hydrogen atom, not only when the latter is in its ground state (paper I), 
but also when it is in a general excited state. It is the purpose of the present paper to make this 
calculation. 

A partial solution, to which we shall refer later, has been given by Mrs Krogdahl (1944). 
It must suffice for the moment to say that although Mrs Krogdahl found the first three 
non-vanishing coefficients in (1) for several excited states of the hydrogen atom, her method was 
not such as to give a general formula applicable to all states. We shall provide such a formula, 
which may be used to determine the first four non-vanishing terms in (1), i.e. up to and including 
$E_5/R^5$, for any initial state of the hydrogen atom. 

2. Choice of Co-ordinates 

We use the notation of I, which may be seen from fig. 1. P is the electron with polar 
co-ordinates r, θ, φ relative to its nucleus A, B is the disturbing proton and AB is the polar 
axis. 

Fig. 1. (The electron P, its nucleus A, and the disturbing proton B at distance R from A.) 

Then if H is the Hamiltonian for the isolated hydrogen atom around A, so that (in 
atomic units) 

$$H = -\tfrac12\nabla^2 - \frac1r, \tag{2}$$

the problem consists in solving the equation 

$$(H - E + V_{\text{pert}})\psi = 0, \tag{3}$$

where E is the energy, and 

$$V_{\text{pert}} = \frac1R - \frac{1}{PB}, \tag{4}$$

PB being the distance of the electron from the disturbing proton. 





As in I, we expand $V_{\text{pert}}$ in powers of 1/R, and we write 

$$E = E_0 + \frac1R E_1 + \frac1{R^2}E_2 + \dots, \tag{5}$$

so that the interaction energy is 

$$E(R) = E - E_0. \tag{6}$$

Next we expand 

$$\psi = \psi_0 + \frac1R\psi_1 + \frac1{R^2}\psi_2 + \dots, \tag{7}$$

where $\psi_0$ is the wave function of the unperturbed hydrogen atom in any one of its allowed 
quantum states. 

Substituting all the expansions in (3) and equating to zero the coefficients of all the various 
powers of 1/R, we obtain the set of equations ((5) in I): 

$$(H - E_0)\psi_0 = 0, \tag{8a}$$
$$(H - E_0)\psi_1 = E_1\psi_0, \tag{8b}$$
$$(H - E_0)\psi_2 - rP_1(\cos\theta)\psi_0 = E_1\psi_1 + E_2\psi_0, \tag{8c}$$
$$(H - E_0)\psi_3 - r^2P_2(\cos\theta)\psi_0 - rP_1(\cos\theta)\psi_1 = E_1\psi_2 + E_2\psi_1 + E_3\psi_0, \tag{8d}$$

etc. 
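
The Legendre terms in (8c) and (8d) come from expanding the perturbation in powers of 1/R. The Python sketch below is an added illustration. It assumes the perturbation is 1/R − 1/PB with PB the electron-proton distance, and compares it with the series $-\sum_{n\ge1} r^nP_n(\cos\theta)/R^{n+1}$, which holds for r < R. The function names are hypothetical.

```python
import math

def legendre(n, c):
    """Legendre polynomial P_n(c) from the three-term recurrence."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, c
    for l in range(1, n):
        p_prev, p = p, ((2 * l + 1) * c * p - l * p_prev) / (l + 1)
    return p

def v_exact(r, theta, big_r):
    """1/R - 1/PB for an electron at (r, theta) and a proton on the polar axis."""
    dist = math.sqrt(r * r + big_r * big_r - 2.0 * r * big_r * math.cos(theta))
    return 1.0 / big_r - 1.0 / dist

def v_series(r, theta, big_r, n_max):
    """Truncated multipole series for the same quantity (valid for r < R)."""
    c = math.cos(theta)
    return -sum(r ** n * legendre(n, c) / big_r ** (n + 1)
                for n in range(1, n_max + 1))

print(v_exact(1.0, 0.7, 10.0), v_series(1.0, 0.7, 10.0, 12))
```

The n = 1 term is $-r\cos\theta/R^2 = -z/R^2$, the uniform-field term that produces the first order Stark effect.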


The (n + 1)th of these equations enables us to find $\psi_n$ if we have previously calculated 
$\psi_0, \psi_1, \dots, \psi_{n-1}$. The condition that $\psi_n$ is everywhere finite enables us to determine $E_n$. The 
situation so far is very similar to that in I. But it differs quite fundamentally in the following 
respect. 

The first equation (8a) shows that $\psi_0$ is the wave function and $E_0$ is the energy of the 
unperturbed hydrogen atom. But there is degeneracy, for if the total quantum number is n, 
so that $E_0 = -1/2n^2$, there are no less than $n^2$ wave functions $\psi_0$. In I, where n = 1, there was 
no degeneracy and therefore no difficulty of this kind. But now there is a difficulty, for the 
perturbation will remove the degeneracy and its nature will determine the correct zero-order 
combinations of the $n^2$ degenerate wave functions. It is easy to see that the symmetry of the 
problem around AB implies that the magnetic quantum number m is still an exact quantum 
number, so that we need only consider at one time linear combinations for which m has the 
same value. This reduces the difficulty, but still leaves a degeneracy of n − |m|. If n is not 
large, it is possible to set up the usual secular determinant, using the familiar r, θ, φ wave 
functions, and in this way we can fix the appropriate zero-order combinations. This indeed is 
the technique of Mrs Krogdahl, but it is open to two objections. On the one hand it makes 
the calculations rather cumbersome, since $\psi_0$, which repeatedly occurs in (8), is a linear 
combination of n − |m| terms with rather untidy coefficients (see Krogdahl, Table I). For 
this reason Krogdahl only considered values of n − |m| lying between 1 and 5, and, except 
for n − |m| = 1, she only calculated as far as $E_3$: in the latter case $E_4$ was also determined. The 
second objection is that, by the nature of the calculation, no general formula valid for all n and 
m could be obtained. 

The problem is made much simpler by recognising that the first term in $V_{\text{pert}}$ (see eq. (4)) 
is $-\dfrac{r}{R^2}P_1(\cos\theta)$, i.e. $-\dfrac{z}{R^2}$. This represents the interaction between the hydrogen atom and a 
uniform electric field. Indeed it will be recognized that (8a) and (8c) are identical with the 
equations for the Stark effect of a hydrogen atom, so that the correct zero-order wave functions 
in our present problem, being determined by the first terms of the perturbation, are identical 
with those encountered in the Stark effect. In this latter problem, as Epstein (1926), 
Schrödinger (1926) and others have shown, great simplification results from working in parabolic 
co-ordinates ξ = r + z, η = r − z, and φ instead of polar co-ordinates r, θ, φ. The wave 
equation is separable in these new co-ordinates, and, what is important for us, these solutions 
are themselves already the correct ones for use in the ordinary perturbation treatment. For 
they represent the correct linear combinations of the original n − |m| degenerate wave 
functions. The same must also be true for our problem, and we shall henceforth deal with the 
set of equations (8) entirely on this basis, giving $\psi_0$ any one of the values found in parabolic 
co-ordinates, and paying no further attention to the question of degeneracy. 





3. Method of Calculation 

According to the reasoning above we are to solve the set of equations (8) in which 

$$E_0 = -\frac{1}{2n^2} \tag{9}$$

and 

$$\psi_0 = f_{k_1}\!\left(\frac{\xi}{n}\right) f_{k_2}\!\left(\frac{\eta}{n}\right) e^{im\phi}, \tag{10}$$

where 

$$f_k(x) = x^{|m|/2}\,e^{-x/2}\,\frac{L^{|m|}_{k+|m|}(x)}{(k+|m|)!} \tag{11}$$

(see Schrödinger, 1926). 

The parameters n, $k_1$, $k_2$, which may be regarded as the three defining quantum numbers, 
are necessarily integers (n positive, $k_1$ and $k_2$ non-negative), and are related to the magnetic quantum number m by the 
equation 

$$k_1 + k_2 + |m| + 1 = n. \tag{12}$$

In practice we have found it easier to work in terms of $k_1$, $k_2$ and n rather than any other 
combination (e.g. n, m, $k_1 - k_2$) which involves m directly. 
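
Relation (12) reproduces the degeneracies quoted in the Introduction, as a short enumeration shows. The Python sketch below is an added illustration with a hypothetical function name; it lists every triple ($k_1$, $k_2$, m) of non-negative $k_1$, $k_2$ and signed m that satisfies (12) for a given n.

```python
def parabolic_states(n):
    """All (k1, k2, m) with k1, k2 >= 0 and k1 + k2 + |m| + 1 = n."""
    states = []
    for m in range(-(n - 1), n):
        for k1 in range(n - abs(m)):
            k2 = n - 1 - abs(m) - k1
            states.append((k1, k2, m))
    return states

for n in range(1, 5):
    states = parabolic_states(n)
    # Total count is n**2; for fixed m there are n - |m| states.
    print(n, len(states), sum(1 for s in states if s[2] == 0))
```

For each n the total is $n^2$, and for fixed m the count is n − |m|, matching the degeneracy that remains once m is recognised as an exact quantum number.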

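We shall find it convenient to introduce the notation 

$$\{r, s\} = f_{k_1+r}\!\left(\frac{\xi}{n}\right) f_{k_2+s}\!\left(\frac{\eta}{n}\right), \tag{13}$$

so that (10) may be written in the form 

$$\psi_0 = \{0, 0\}\,e^{im\phi}. \tag{14}$$

Our earlier comment about m being an exact quantum number implies that each function 
$\psi_p$ in (8) is of the form 

$$\psi_p = X_p\,e^{im\phi}, \tag{15}$$

in which $X_p$ is a function of ξ and η only. If we put 

$$x = \frac{\xi}{n}, \qquad y = \frac{\eta}{n}, \tag{16}$$

the equation which $X_p$ must satisfy is easily found to be 

$$\left\{\frac{\partial}{\partial x}\left(x\frac{\partial}{\partial x}\right) + \frac{\partial}{\partial y}\left(y\frac{\partial}{\partial y}\right) - \frac{m^2}{4}\left(\frac1x+\frac1y\right) - \frac{x+y}{4} + n\right\}X_p$$
$$= -\frac{n^2}{2}(x+y)\big[r^{p-1}P_{p-1}(\cos\theta)X_0 + r^{p-2}P_{p-2}(\cos\theta)X_1 + \dots + rP_1(\cos\theta)X_{p-2} + E_1X_{p-1} + E_2X_{p-2} + \dots + E_pX_0\big]. \tag{17}$$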

Let the differential operator on the left be denoted by L. Thus we require to solve the 
set of equations 

$$L(X_1) = -\frac{n^2}{2}(x+y)E_1X_0, \tag{18a}$$
$$L(X_2) = -\frac{n^2}{2}(x+y)\big[rP_1(\cos\theta)X_0 + E_1X_1 + E_2X_0\big], \tag{18b}$$
$$L(X_3) = -\frac{n^2}{2}(x+y)\big[r^2P_2(\cos\theta)X_0 + rP_1(\cos\theta)X_1 + E_1X_2 + E_2X_1 + E_3X_0\big], \tag{18c}$$

etc., 

and we are given that 

$$X_0 = \{0, 0\}. \tag{19}$$




The importance of the functions {r, s} introduced in (13) lies in the differential equation which 
they satisfy. Thus, since the equation 

$$(xv')' + \left(\frac{1-|m|}{2} - \frac{x}{4} - \frac{m^2}{4x}\right)v + (|m| + k_1 + r)\,v = 0$$

is satisfied (Courant and Hilbert, p. 285) by 

$$v = f_{k_1+r}(x),$$

it follows from the definitions of {r, s} and from (12) that 

$$L\{r, s\} = -(r+s)\{r, s\}. \tag{20}$$

This simple result suggests that we should seek the solution of our fundamental equations (18) 
in terms of the functions {r, s}, which form a complete orthogonal set. In order to do this 
we require to know the expansion of $x^p\{r, s\}$ in terms of the members of this set. It is 
shown in the Appendix that we may write 

$$x^p\{r, s\} = \sum_{i=-p}^{p} a_i\{r+i, s\},$$

where 

$$a_i = (-1)^i\,(p!)^2\,\frac{(k_1+r+i)!}{(k_1+r+i+|m|)!}\sum_c \frac{(p+|m|+k_1+r-c)!}{(k_1+r-c)!\;c!\;(p-c)!\;(c+i)!\;(p-c-i)!}. \tag{21}$$

Similarly 

$$y^p\{r, s\} = \sum_{i=-p}^{p} b_i\{r, s+i\},$$

where $b_i$ is obtained from $a_i$ by replacing $k_1$ by $k_2$ and r by s. The summation in (21) is over 
all integral values of c which make the argument of every one of the factorials non-negative. This formula is 
only required for fairly small values of p, and it is convenient, in order to avoid confusion, to 
introduce the following notation: 

$$x\{r, s\} = \sum_{i=-1}^{1} A_i^{k_1+r}\{r+i, s\},$$
$$x^2\{r, s\} = \sum_{i=-2}^{2} B_i^{k_1+r}\{r+i, s\},$$
$$x^3\{r, s\} = \sum_{i=-3}^{3} C_i^{k_1+r}\{r+i, s\}, \tag{22}$$

etc. 

The numerical values of the constants $A_i^k$, $B_i^k$, . . . are found as particular examples of the 
general formula (21). 
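
Formula (21) is easy to evaluate mechanically. The Python sketch below is an added illustration, not the authors' procedure. It assumes (21) in the form $a_i = (-1)^i (p!)^2 \frac{(k+i)!}{(k+i+|m|)!}\sum_c \frac{(p+|m|+k-c)!}{(k-c)!\,c!\,(p-c)!\,(c+i)!\,(p-c-i)!}$ with $k = k_1 + r$, uses exact rational arithmetic, and compares the case p = 1 with the three-term recurrence for the Laguerre functions quoted in the Appendix. The function name `expansion_coeff` is hypothetical.

```python
from fractions import Fraction
from math import factorial

def expansion_coeff(p, i, k, m):
    """Coefficient of f_{k+i} in x**p * f_k; k stands for k1 + r, m for |m|."""
    if k + i < 0:
        return Fraction(0)          # there is no function with a negative index
    total = Fraction(0)
    for c in range(p + 1):
        # Keep only the c for which every factorial argument is non-negative.
        if min(k - c, c + i, p - c - i) < 0:
            continue
        total += Fraction(
            factorial(p + m + k - c),
            factorial(k - c) * factorial(c) * factorial(p - c)
            * factorial(c + i) * factorial(p - c - i))
    sign = -1 if i % 2 else 1
    return sign * factorial(p) ** 2 * Fraction(factorial(k + i),
                                               factorial(k + i + m)) * total

# The constants A_i, B_i of (22) for one illustrative choice k = 2, |m| = 1.
k, m = 2, 1
print([expansion_coeff(1, i, k, m) for i in (-1, 0, 1)])
print([expansion_coeff(2, i, k, m) for i in range(-2, 3)])
```

For p = 1 the three values should reduce to −(|m| + k), |m| + 2k + 1 and −(k + 1), and the p = 2 values should agree with applying the p = 1 expansion twice.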

We are now in a position to solve the equations (18) one by one. Consider first (18a). By 
using the formulae for x{r, s} and y{r, s}, it may be written 

$$L(X_1) = -\frac{n^2}{2}E_1\sum_{i=-1}^{1}\big[A_i^{k_1}\{i, 0\} + A_i^{k_2}\{0, i\}\big].$$

So let us put 

$$X_1 = \sum_{r,s} a_{r,s}\{r, s\}. \tag{23}$$

Then according to (20) we get the equation 

$$\sum_{r,s}(r+s)\,a_{r,s}\{r, s\} = \frac{n^2}{2}E_1\sum_{i=-1}^{1}\big[A_i^{k_1}\{i, 0\} + A_i^{k_2}\{0, i\}\big]. \tag{24}$$





Since the functions {r, s} are an orthogonal set we may equate coefficients of {r, s} on both 
sides. In particular, the coefficient of {0, 0} on the left-hand side is zero, while that on the 
right is 

$$\frac{n^2}{2}E_1\big[A_0^{k_1} + A_0^{k_2}\big] = n^3E_1.$$

Therefore 

$$E_1 = 0. \tag{25}$$

Equation (24) now reduces to 

$$\sum_{r,s}(r+s)\,a_{r,s}\{r, s\} = 0.$$

This shows that every $a_{r,s}$ is zero except possibly those for which r + s = 0, which are at present 
indeterminate. We can take $a_{0,0} = 0$, as this only affects the normalisation (see I). 

Instead of (23) we can now write 

$$X_1 = \sum_j a_{j,-j}\{j, -j\}, \tag{26}$$

where the $a_{j,-j}$ will have to be determined later (from a detailed study of $X_3$: see equation (34)). 
We can now proceed to equation (18b). Thus, to calculate $X_2$ and $E_2$, we put 

$$X_2 = \sum_{r,s} b_{r,s}\{r, s\}, \quad \text{with } b_{0,0} = 0, \tag{27}$$

in this equation. It becomes 

$$\sum_{r,s}(r+s)\,b_{r,s}\{r, s\} = \frac{n^2}{2}(x+y)\big[rP_1(\cos\theta)X_0 + E_2X_0\big]. \tag{28}$$

Now 

$$rP_1(\cos\theta) = z = \frac n2(x-y).$$

Thus 

$$(x+y)\,rP_1(\cos\theta)X_0 = \frac n2(x^2-y^2)\{0, 0\} = \frac n2\sum_{i=-2}^{2}\big[B_i^{k_1}\{i, 0\} - B_i^{k_2}\{0, i\}\big],$$

and 

$$(x+y)E_2X_0 = E_2(x+y)\{0, 0\} = E_2\sum_{i=-1}^{1}\big[A_i^{k_1}\{i, 0\} + A_i^{k_2}\{0, i\}\big].$$

Substituting these results in (28) and equating the coefficients of {0, 0} on either side of the 
equation gives 

$$0 = \frac n2\big[B_0^{k_1} - B_0^{k_2}\big] + E_2\big[A_0^{k_1} + A_0^{k_2}\big].$$

But 

$$B_0^{k_1} - B_0^{k_2} = 6n(k_1 - k_2)$$

and 

$$A_0^{k_1} + A_0^{k_2} = 2n,$$

so that 

$$E_2 = -\frac{3n}{2}(k_1 - k_2). \tag{29}$$

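By considering the coefficients of the remaining {r, s} which occur on the right-hand side of 
(28) we obtain the results 

$$\begin{aligned}
b_{2,0} &= \frac{n^3}{8}B_2^{k_1}, & b_{-2,0} &= -\frac{n^3}{8}B_{-2}^{k_1},\\
b_{1,0} &= \frac{n^3}{4}B_1^{k_1} + \frac{n^2}{2}E_2A_1^{k_1}, & b_{-1,0} &= -\frac{n^3}{4}B_{-1}^{k_1} - \frac{n^2}{2}E_2A_{-1}^{k_1},\\
b_{0,2} &= -\frac{n^3}{8}B_2^{k_2}, & b_{0,-2} &= \frac{n^3}{8}B_{-2}^{k_2},\\
b_{0,1} &= -\frac{n^3}{4}B_1^{k_2} + \frac{n^2}{2}E_2A_1^{k_2}, & b_{0,-1} &= \frac{n^3}{4}B_{-1}^{k_2} - \frac{n^2}{2}E_2A_{-1}^{k_2}.
\end{aligned} \tag{30}$$

The remaining $b_{r,s}$ are zero except possibly those of the form $b_{j,-j}$, which are indeterminate 
at present. 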

The solutions of the remaining equations (18c), (18d), . . . for $X_3$, $X_4$, . . . follow in a 
similar way. But there is one new factor which makes it desirable to work through the particular 
case of $X_3$ in rather more detail. For each of these equations determines not only the corresponding 
energy $E_3$, $E_4$, . . . and most of the coefficients in the wave functions $X_3$, $X_4$, . . . of 
the same order, but it also completes the calculation of the wave functions whose order is two 
less than that of the equation itself. Thus the equation for $X_3$ completes $X_1$, and that for $X_4$ 
completes $X_2$, etc. This is a rather unusual situation, but we can illustrate it by the case of 
$X_3$ as follows. In equation (18c) put 

$$X_3 = \sum_{r,s} c_{r,s}\{r, s\}, \quad \text{with } c_{0,0} = 0, \tag{31}$$

obtaining 

$$\sum_{r,s}(r+s)\,c_{r,s}\{r, s\} = \frac{n^2}{2}(x+y)\big[r^2P_2(\cos\theta)X_0 + rP_1(\cos\theta)X_1 + E_2X_1 + E_3X_0\big]. \tag{32}$$

Expanding the terms on the right-hand side we get 

$$(x+y)\,r^2P_2(\cos\theta)X_0 = \frac{n^2}{4}(x^3 - 3x^2y - 3xy^2 + y^3)\{0, 0\}$$
$$= \frac{n^2}{4}\Big[\sum_{i=-3}^{3} C_i^{k_1}\{i, 0\} - 3\sum_{i=-2}^{2}\sum_{j=-1}^{1} B_i^{k_1}A_j^{k_2}\{i, j\} - 3\sum_{i=-1}^{1}\sum_{j=-2}^{2} A_i^{k_1}B_j^{k_2}\{i, j\} + \sum_{i=-3}^{3} C_i^{k_2}\{0, i\}\Big],$$

$$(x+y)\,rP_1(\cos\theta)X_1 = \frac n2(x^2-y^2)\sum_j a_{j,-j}\{j, -j\} = \frac n2\sum_j\sum_{i=-2}^{2} a_{j,-j}\big[B_i^{k_1+j}\{j+i, -j\} - B_i^{k_2-j}\{j, -j+i\}\big],$$

$$(x+y)E_2X_1 = E_2(x+y)\sum_j a_{j,-j}\{j, -j\} = E_2\sum_j\sum_{i=-1}^{1} a_{j,-j}\big[A_i^{k_1+j}\{j+i, -j\} + A_i^{k_2-j}\{j, -j+i\}\big],$$

$$(x+y)E_3X_0 = E_3(x+y)\{0, 0\} = E_3\sum_{i=-1}^{1}\big[A_i^{k_1}\{i, 0\} + A_i^{k_2}\{0, i\}\big].$$



Using these expressions in (32) and equating coefficients of {0, 0} we find 

$$0 = \frac{n^2}{4}\big[C_0^{k_1} - 3B_0^{k_1}A_0^{k_2} - 3A_0^{k_1}B_0^{k_2} + C_0^{k_2}\big] + E_3\big[A_0^{k_1} + A_0^{k_2}\big].$$

On simplifying, this gives 

$$E_3 = \frac{n^2}{2}\big[n^2 - 6(k_1 - k_2)^2 - 1\big]. \tag{33}$$
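
The two closed forms (29) and (33) can be recovered numerically from the {0, 0} coefficient equations. The Python sketch below is an added illustration under stated assumptions: it takes $A_0$, $B_0$, $C_0$ from formula (21) with p = 1, 2, 3 and i = 0, solves $0 = \frac n2[B_0^{k_1} - B_0^{k_2}] + E_2[A_0^{k_1} + A_0^{k_2}]$ for $E_2$ and the corresponding cubic relation for $E_3$, and uses |m| = n − 1 − $k_1$ − $k_2$ from (12). All function names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def diagonal_coeff(p, k, m):
    """Coefficient of f_k in x**p * f_k (the i = 0 case of (21)); m is |m|."""
    total = Fraction(0)
    for c in range(min(p, k) + 1):
        total += Fraction(factorial(p + m + k - c),
                          factorial(k - c) * factorial(c) ** 2
                          * factorial(p - c) ** 2)
    return factorial(p) ** 2 * Fraction(factorial(k), factorial(k + m)) * total

def e2_e3_from_coefficients(n, k1, k2):
    """Solve the {0,0} coefficient equations for E_2 and E_3."""
    m = n - 1 - k1 - k2                       # |m| from relation (12)
    a = lambda k: diagonal_coeff(1, k, m)
    b = lambda k: diagonal_coeff(2, k, m)
    c = lambda k: diagonal_coeff(3, k, m)
    denom = a(k1) + a(k2)
    e2 = -Fraction(n, 2) * (b(k1) - b(k2)) / denom
    e3 = -Fraction(n * n, 4) * (c(k1) - 3 * b(k1) * a(k2)
                                - 3 * a(k1) * b(k2) + c(k2)) / denom
    return e2, e3

# Example: the n = 2 state with k1 = 1, k2 = 0 (so m = 0).
print(e2_e3_from_coefficients(2, 1, 0))
```

For the ground state (n = 1, $k_1 = k_2 = 0$) both values vanish, as they must for a spherically symmetric charge distribution.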


The coefficients $c_{r,s}$ in the expansion (31) of $X_3$ are similarly found from (32), except those for 
which r + s = 0. Twenty-six of them are non-zero, but only a few of these are needed in the 
calculations of $E_4$ and $E_5$, so that it does not seem necessary to tabulate their values here. 
Terms for which r + s = 0 do not appear on the left of (32), but they do appear on the right in a 
form which involves the hitherto unknown coefficients in $X_1$. Considering, for example, the 
coefficients of {1, −1}, we see that 

$$0 = \frac{n^2}{4}\big[-3B_1^{k_1}A_{-1}^{k_2} - 3A_1^{k_1}B_{-1}^{k_2}\big] + \frac n2\big[B_0^{k_1+1} - B_0^{k_2-1}\big]a_{1,-1} + E_2\big[A_0^{k_1+1} + A_0^{k_2-1}\big]a_{1,-1},$$

from which we obtain 

$$a_{1,-1} = +\frac n2(k_1+1)(n-k_1-1).$$

Similarly 

$$a_{-1,1} = -\frac n2(k_2+1)(n-k_2-1),$$

and the remaining $a_{j,-j}$ are zero. In this way we complete the calculation of $X_1$, which takes 
the form 

$$X_1 = \frac n2\big[(k_1+1)(n-k_1-1)\{1, -1\} - (k_2+1)(n-k_2-1)\{-1, 1\}\big]. \tag{34}$$

It is rather surprising that the precise form of $X_1$ is only determined by later considerations 
involved in the discussion not of $X_2$ but of $X_3$. 

In the differential equation for $X_4$ we put 

$$X_4 = \sum_{r,s} d_{r,s}\{r, s\}, \quad \text{with } d_{0,0} = 0, \tag{35}$$

and obtain 

$$\sum_{r,s}(r+s)\,d_{r,s}\{r, s\} = \frac{n^2}{2}(x+y)\big[r^3P_3(\cos\theta)X_0 + r^2P_2(\cos\theta)X_1 + rP_1(\cos\theta)X_2 + E_2X_2 + E_3X_1 + E_4X_0\big]. \tag{36}$$

This determines $E_4$ and, if necessary, $b_{j,-j}$ and $d_{r,s}$ (r + s ≠ 0). In fact 

$$E_4 = -\frac{n^4}{8}\big[4n^2 + 9n(k_1+k_2+1) - 6(k_1^2+k_2^2) - 6k_1k_2 - 9(k_1+k_2) + 5\big]$$
$$\qquad - \frac{n^3}{8}(k_1-k_2)\big[-24n^2 + 9n(k_1+k_2+1) + 50(k_1^2+k_2^2) - 118k_1k_2 - 9(k_1+k_2) + 25\big]. \tag{37}$$

Likewise from the next equation for $X_5$ we get the final result: 

$$E_5 = -\frac{3n^4}{8}\big\{n^4 + n^2\big[-31(k_1^2+k_2^2) + 64k_1k_2 + k_1 + k_2 - 5\big] + n(k_1+k_2+1)\big[16(k_1^2+k_2^2) - 34k_1k_2 - k_1 - k_2\big]$$
$$\qquad + 35(k_1^4+k_2^4) - 172k_1k_2(k_1^2+k_2^2) + 276k_1^2k_2^2 - 16(k_1^3+k_2^3) + 18k_1k_2(k_1+k_2) + 41(k_1^2+k_2^2) - 80k_1k_2 + 4\big\}$$
$$\qquad - \frac{n^5}{16}(k_1-k_2)\big[22n^2 + 63n(k_1+k_2+1) - 30(k_1^2+k_2^2) - 66k_1k_2 - 63(k_1+k_2) + 65\big]. \tag{38}$$





4. Conclusion 

Equations (25), (29), (33), (37) and (38) give the coefficients $E_1 \dots E_5$ of the desired 
expansion (1). It will be noticed that they exhibit considerable similarity in their dependence 
on $k_1$ and $k_2$, and they may be regarded as the appropriate generalisation of the result in 
paper I. For some purposes it is more convenient to show the dependence of the perturbation 
energy upon n, m, and $k_1 - k_2$. This can be done by using (12) to eliminate $k_1 + k_2$. The 
results are: 

$$E_2 = -\frac{3n}{2}(k_1 - k_2), \tag{39a}$$

$$E_3 = \frac{n^2}{2}\big[n^2 - 6(k_1-k_2)^2 - 1\big], \tag{39b}$$

$$E_4 = -\frac{n^4}{16}\big[17n^2 - 9m^2 - 3(k_1-k_2)^2 + 19\big] - \frac{n^3}{16}(k_1-k_2)\big[-39n^2 - 9m^2 + 109(k_1-k_2)^2 + 59\big], \tag{39c}$$

$$E_5 = -\frac{3n^4}{64}\big[345(k_1-k_2)^4 + 6(k_1-k_2)^2(-31n^2 - 11m^2 + 65) + 9n^4 + m^4 - 2n^2m^2 - 42n^2 - 2m^2 + 33\big]$$
$$\qquad - \frac{n^5}{32}(k_1-k_2)\big[107n^2 - 63m^2 + 3(k_1-k_2)^2 + 193\big]. \tag{39d}$$

It will be noticed that $E_2$ is the first order Stark effect, and the first term of $E_4$ is the second 
order Stark effect. It is an easy matter to calculate the coefficients for any assigned state of 
the hydrogen atom; indeed we have not thought it worth while to include any of these here, 
because Krogdahl gives several of them in her work previously referred to, and, so far as they 
go, her results agree with ours. 
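
The passage from the ($k_1$, $k_2$) forms to the (n, m, $k_1 - k_2$) forms is pure algebra, so it can be checked state by state. The Python sketch below is an added illustration. Every numerical coefficient in it is transcribed from (37), (38) and (39a) to (39d) as they can best be read in this copy, which is an assumption, and the function names are hypothetical. It checks that the two forms of $E_4$ and $E_5$ agree once |m| = n − 1 − $k_1$ − $k_2$ is substituted, and evaluates the ground state.

```python
from fractions import Fraction as F

def energies_nmd(n, m, d):
    """E_2..E_5 as functions of n, m and d = k1 - k2 (forms (39a)-(39d))."""
    e2 = F(-3 * n * d, 2)
    e3 = F(n ** 2, 2) * (n ** 2 - 6 * d ** 2 - 1)
    e4 = (-F(n ** 4, 16) * (17 * n ** 2 - 9 * m ** 2 - 3 * d ** 2 + 19)
          - F(n ** 3, 16) * d * (-39 * n ** 2 - 9 * m ** 2 + 109 * d ** 2 + 59))
    even5 = (345 * d ** 4 + 6 * d ** 2 * (-31 * n ** 2 - 11 * m ** 2 + 65)
             + 9 * n ** 4 + m ** 4 - 2 * n ** 2 * m ** 2
             - 42 * n ** 2 - 2 * m ** 2 + 33)
    e5 = (-F(3 * n ** 4, 64) * even5
          - F(n ** 5, 32) * d * (107 * n ** 2 - 63 * m ** 2 + 3 * d ** 2 + 193))
    return e2, e3, e4, e5

def e4_e5_k(n, k1, k2):
    """E_4 and E_5 as functions of n, k1, k2 (forms (37) and (38))."""
    s, q, pr, d = k1 + k2, k1 ** 2 + k2 ** 2, k1 * k2, k1 - k2
    e4 = (-F(n ** 4, 8) * (4 * n ** 2 + 9 * n * (s + 1) - 6 * q - 6 * pr - 9 * s + 5)
          - F(n ** 3, 8) * d * (-24 * n ** 2 + 9 * n * (s + 1) + 50 * q
                                - 118 * pr - 9 * s + 25))
    curly = (n ** 4 + n ** 2 * (-31 * q + 64 * pr + s - 5)
             + n * (s + 1) * (16 * q - 34 * pr - s)
             + 35 * (k1 ** 4 + k2 ** 4) - 172 * pr * q + 276 * pr ** 2
             - 16 * (k1 ** 3 + k2 ** 3) + 18 * pr * s + 41 * q - 80 * pr + 4)
    e5 = (-F(3 * n ** 4, 8) * curly
          - F(n ** 5, 16) * d * (22 * n ** 2 + 63 * n * (s + 1) - 30 * q
                                 - 66 * pr - 63 * s + 65))
    return e4, e5

# Ground state: n = 1, m = 0, k1 = k2 = 0.
print(energies_nmd(1, 0, 0))
```

For the ground state the sketch gives $E_2 = E_3 = E_5 = 0$ and $E_4 = -9/4$, the familiar polarisation energy of a hydrogen atom in the field of a unit charge.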

There is only one final comment to make. In I, where n = 1, it was noticed that successive 
coefficients $E_1, E_2, \dots$ increased steadily, but slowly, up to $E_{10}$, which was the last one 
calculated. A similar increase occurs here, but on account of the high powers of n involved 
($n^8$ appears in the expression for $E_5$) the increase is much greater. This is an expression of 
the fact that the Van der Waals forces for an excited atom are much greater than those for an 
atom in its ground state; it implies that the original expansion (1) becomes invalid, through 
non-convergence, at much larger values of R than in I. For smaller values of R other expressions 
for the interaction energy than the expansion (1) must be employed. 


5. Summary 

The interaction energy, or Van der Waals force, between a proton and a hydrogen atom 
in any one of its allowed quantum states is calculated in terms of the internuclear distance R 
by an expansion of the form 

$$E(R) = \frac1R E_1 + \frac1{R^2}E_2 + \dots$$

All the coefficients up to and including $E_5$ are obtained in closed form. For values of R for 
which the expansion is valid, the coefficients are determined absolutely, no approximations 
being introduced. 


APPENDIX 

To express $x^p\{r, s\}$ in the form $\sum_i a_i\{r+i, s\}$. For small values of p this expansion can be 
obtained by repeated application of the recurrence formula for the Laguerre polynomials 

$$x\{r, s\} = -(|m| + k_1 + r)\{r-1, s\} + (|m| + 2k_1 + 2r + 1)\{r, s\} - (k_1 + r + 1)\{r+1, s\}.$$

But the general result is best obtained by the use of generating functions. Thus, let 

$$x^p\{r, s\} = \sum_{i=-p}^{p} a_i\{r+i, s\},$$

i.e. 

$$x^p f_{k_1+r}(x) = \sum_{i=-p}^{p} a_i f_{k_1+r+i}(x). \tag{40}$$

We know that 

$$\int_0^\infty f_n(x)\,f_{n'}(x)\,dx = \begin{cases} 0, & n \ne n', \\[4pt] \dfrac{(n+|m|)!}{n!}, & n = n'. \end{cases} \tag{41}$$

Therefore, multiplying each side of (40) by $f_{k_1+r+i}(x)$ and integrating gives 

$$\frac{(k_1+r+i+|m|)!}{(k_1+r+i)!}\,a_i = \int_0^\infty \frac{x^{p+|m|}\,e^{-x}\,L^{|m|}_{k_1+r+|m|}(x)\,L^{|m|}_{k_1+r+i+|m|}(x)}{(k_1+r+|m|)!\,(k_1+r+i+|m|)!}\,dx.$$

Schrödinger (1926) has shown that this integral may be evaluated by the use of the generating 
function: 

$$\sum_{k=0}^{\infty}\frac{L^{|m|}_{k+|m|}(x)}{(k+|m|)!}\,t^k = (-1)^{|m|}\,\frac{e^{-xt/(1-t)}}{(1-t)^{|m|+1}}.$$

In fact, 

$$\frac{(k_1+r+i+|m|)!}{(k_1+r+i)!}\,a_i = \text{coefficient of } t^{k_1+r}u^{k_1+r+i} \text{ in } \int_0^\infty x^{p+|m|}\,\frac{e^{-x\left[1 + \frac{t}{1-t} + \frac{u}{1-u}\right]}}{(1-t)^{|m|+1}(1-u)^{|m|+1}}\,dx$$

$$= \text{coefficient of } t^{k_1+r}u^{k_1+r+i} \text{ in } (p+|m|)!\,(1-t)^p(1-u)^p(1-ut)^{-(p+|m|+1)}$$

$$= (p+|m|)!\sum_n \frac{(p+|m|+n)!\;(-1)^i\;p!\;p!}{n!\,(p+|m|)!\,(k_1+r-n)!\,(p-k_1-r+n)!\,(k_1+r+i-n)!\,(p-k_1-r-i+n)!}.$$

Therefore, putting $k_1 + r - n = c$, 

$$a_i = (-1)^i\,(p!)^2\,\frac{(k_1+r+i)!}{(k_1+r+i+|m|)!}\sum_c \frac{(p+|m|+k_1+r-c)!}{(k_1+r-c)!\;c!\;(p-c)!\;(c+i)!\;(p-c-i)!}. \tag{42}$$

This gives the required expansion (21). 


REFERENCES TO LITERATURE 
COULSON, C. A., 1941. Proc. Roy. Soc. Edin., A, LXI, 20. 

COURANT, R., and HILBERT, D., 1931. Methoden der Mathematischen Physik, I, 2nd ed., 285. 

EPSTEIN, P. S., 1926. Phys. Rev., XXVIII, 695. 

KROGDAHL, M. K., 1944. Astrophys. Journ., C, 311. 

SCHRÖDINGER, E., 1926. Ann. d. Phys., LXXX, 437. 


(Issued separately September 22, 1948) 





XXXVIII.— On the Estimation of Many Statistical Parameters. By A. C. Aitken, 
D.Sc., F.R.S., Mathematical Institute, University of Edinburgh 
(MS. received November 4, 1946. Read February 3, 1947) 


1. Introductory 

In an earlier paper (Aitken and Silverstone, 1941) the problem of estimating from a sample 
a parameter θ of unknown value was treated by adopting two postulates for the estimating 
function: (i) that it should be unbiased in the linear sense; (ii) that its sampling variance 
should be minimal. Specifically, in the case of a probability function φ = φ(x; θ) involving 
a parameter θ, if a sample of N independent values of x has provided the vector 

$$x = \{x_1\ x_2\ \dots\ x_N\},$$

then under certain conditions an estimating function 

$$t = t(x_1, x_2, \dots, x_N)$$

exists such that 

$$\int t\,\Phi(x)\,dx = \theta, \tag{1.1}$$
$$\int (t-\theta)^2\,\Phi(x)\,dx = \text{minimum}, \tag{1.2}$$
$$\int \Phi(x)\,dx = 1, \tag{1.3}$$

where Φ(x) is the probability density of the sample vector x, and $dx = dx_1\,dx_2 \dots dx_N$. The 
classes of function φ amenable to the method are limited. An important sub-class is that of 
functions φ satisfying the differential equation 

$$\frac{\partial}{\partial\theta}\log\Phi = (t-\theta)/\lambda(\theta), \tag{1.4}$$

where Φ is the product of N similar functions φ, and λ(θ) is independent of x. It was proved 
that the sampling variance of t(x) is then equal to λ(θ), and that the function φ admitting 
estimation of θ under the postulates (1.3) is of the type 

$$\phi(x;\theta) = \exp\{f(x) + F(\theta) + t(x)\gamma(\theta)\}, \tag{1.5}$$

known (Koopman, 1936) to be the only type admitting sufficient statistics. It was also 
proved that in many cases, where θ itself does not admit an estimating function t, some 
function τ(θ) does so, where 

$$\tau(\theta) = -\frac{\partial F}{\partial\theta}\Big/\frac{\partial\gamma}{\partial\theta}. \tag{1.6}$$
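
A concrete instance may help fix the meaning of (1.4). The Python sketch below is an added illustration and is not taken from the paper: it assumes a Poisson probability function $\phi(x;\theta) = e^{-\theta}\theta^x/x!$, for which the estimator is the sample mean and λ(θ) = θ/N. It compares a finite-difference derivative of log Φ with (t − θ)/λ(θ) for an arbitrary made-up sample. The function names and the sample values are hypothetical.

```python
import math

def log_likelihood(sample, theta):
    """log of the product of Poisson probabilities over the sample."""
    return sum(-theta + x * math.log(theta) - math.lgamma(x + 1)
               for x in sample)

def score_numeric(sample, theta, h=1e-6):
    """Central finite difference for d(log Phi)/d(theta)."""
    return (log_likelihood(sample, theta + h)
            - log_likelihood(sample, theta - h)) / (2.0 * h)

def score_from_estimator(sample, theta):
    """Right-hand side of (1.4) with t the sample mean and lambda = theta/N."""
    n_obs = len(sample)
    t = sum(sample) / n_obs
    lam = theta / n_obs
    return (t - theta) / lam

sample = [3, 1, 4, 1, 5, 9, 2, 6]   # illustrative data only
print(score_numeric(sample, 3.5), score_from_estimator(sample, 3.5))
```

In this example λ(θ) = θ/N is also the sampling variance of the mean of N Poisson observations, in line with the statement following (1.4).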


The above procedure was extended (Solomon, 1944) to the case of two parameters $\theta_1$, $\theta_2$; 
but to avoid an impasse the second postulate had to be modified. It was now proposed to 
minimize, not the two separate sampling variances of the estimators $t_1(x)$ and $t_2(x)$, but the 
determinant |V| of their variance matrix $V = [v_{ij}]$, where 

$$v_{ij} = \int (t_i - \theta_i)(t_j - \theta_j)\,\Phi\,dx, \qquad i, j = 1, 2. \tag{1.7}$$

The determinant |V| is in fact the generalized variance of Wilks (Wilks, 1932). This 
approach to the problem of multiple estimation is from a direction opposite to that taken 
by R. C. Geary, who proved (Geary, 1942) that when many parameters are simultaneously 
estimated from a sample of N values by the method of maximum likelihood the generalized 
variance, for fixed but indefinitely large N, tends to a minimum in an asymptotic sense, 
while at the same time (as was first shown by R. A. Fisher, 1934) the joint distribution of 
the $t_i(x)$ tends to the correlated normal type. 




Analogues in two parameters were found by Dr Solomon for the various theorems of 
Silverstone in one parameter, and such cases as the simultaneous estimation of mean and 
variance from a normal sample were studied in detail. The results, and their generalizations, 
will be the subject of remark in §§ 3, 5 and 6. 

A recent paper (Rao, 1945) treats the estimation of many parameters from a standpoint 
resembling the present one, but centring around the matrix 

(1.8) 

which is fundamental in R. A. Fisher's theory (Fisher, 1921) of the “amount of information” 
latent in a sample. 

Our purpose is now to apply the postulates of unbiased estimate and minimal determinant 
of variance to the case of any number k of parameters, and to examine in detail the case of 
a multivariate normal sample. 


2. Postulates for Estimation 

For conciseness we adopt vector and matrix notation. The probability density of the vector 
variate, $\phi(x_1, x_2, \dots, x_n;\ \theta_1, \theta_2, \dots, \theta_k)$, is denoted by φ(x; θ), where $\theta = \{\theta_1\ \theta_2\ \dots\ \theta_k\}$; 
that of a sample of N such vectors will be denoted by Φ(X; θ), where X is the sample vector 
of Nn elements, imagined written out in N sets of n. In the cases considered Φ will be the 
product of N similar factors φ, so that log Φ = Σ log φ, and in fact the results for φ and Φ 
will differ merely by some power of N. We shall assume that φ possesses first and second 
derivatives with respect to the $\theta_i$. 

The postulates for the estimators $t_i = t_i(x)$ are now: 

(i) $\int t_i\,\Phi\,dx = \theta_i$, i = 1, 2, . . ., k; 

(ii) $|V| = |v_{ij}|$ = minimum, 

where 

$$v_{ij} = \int (t_i - \theta_i)(t_j - \theta_j)\,\Phi\,dx; \tag{2.1}$$

(iii) $\int \Phi\,dx = 1$. 

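We proceed to show that a class of functions $\Phi$ admitting the method is given by the 
solutions of 

$$(t - \theta)\,\Phi = \Lambda\,\Phi_\theta, \tag{2.2}$$

where $t$ is the vector of estimating functions, $\Phi_\theta$ is the vector of partial derivatives of $\Phi$ with 
respect to the elements of $\theta$, and $\Lambda = [\lambda_{ij}(\theta)]$ is a matrix of Lagrange functions independent 
of $x$ but depending on $\theta$. For from (2.1) we have 

$$\int \Phi_{\theta_i}\,dx = 0, \qquad \Big[\int t_i\,\Phi_{\theta_j}\,dx\Big] = I, \tag{2.3}$$

where $I$ is the unit matrix. Hence 

$$\int (t_i - \theta_i)(t_j - \theta_j)\,\Phi\,dx = \int (t_i - \theta_i)\sum_k \lambda_{jk}\,\Phi_{\theta_k}\,dx = \lambda_{ji} = \lambda_{ij}, \quad \text{by (2.3)}. \tag{2.4}$$

Hence under the conditions (2.1) we have 

$$|V| = |\lambda_{ij}| = |\Lambda|. \tag{2.5}$$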

We must now prove that $|\Lambda|$ is the minimal value of $|V|$ attainable by any set of $k$ unbiased 
functions estimating the $\theta_i$. Let us consider any vector $u$ composed of such functions. Then 

$$\int u_i\,\Phi\,dx = \theta_i, \qquad \Big[\int u_i\,\Phi_{\theta_j}\,dx\Big] = I, \qquad \int (u_i - t_i)\,\Phi_{\theta_j}\,dx = 0. \tag{2.6}$$



Hence 

$$\int (u_i - \theta_i)(u_j - \theta_j)\,\Phi\,dx \tag{2.7}$$

$$= \int (u_i - t_i)(u_j - t_j)\,\Phi\,dx + \int (u_i - t_i)(t_j - \theta_j)\,\Phi\,dx + \int (t_i - \theta_i)(u_j - t_j)\,\Phi\,dx + \int (t_i - \theta_i)(t_j - \theta_j)\,\Phi\,dx \tag{2.8}$$

$$= \int (u_i - t_i)(u_j - t_j)\,\Phi\,dx + \lambda_{ij}, \tag{2.9}$$

by substituting from (2.2) in the second and third integrals of (2.8) and referring to (2.3). 
Writing (2.9) as $\mu_{ij} + \lambda_{ij}$, we have to prove that 

$$|\lambda_{ij} + \mu_{ij}| = |\Lambda + M| \ge |\Lambda|. \tag{2.10}$$


It has been assumed that $|\Lambda| \neq 0$, that is, that the derivatives $\Phi_{\theta_i}$ are linearly independent 
in the functional sense. Thus $\Lambda$ is symmetric and also positive definite of full rank $k$, while 
$M$ from its mode of derivation is positive definite, or possibly non-negative definite of rank 
less than $k$. In any case a real non-singular matrix $H$ exists such that 

$$H'\Lambda H = \operatorname{diag}(\alpha_1, \alpha_2, \ldots, \alpha_k), \qquad H'MH = \operatorname{diag}(\beta_1, \beta_2, \ldots, \beta_k), \tag{2.11}$$

where $\alpha_i > 0$, $\beta_i \ge 0$. Hence, taking determinants and cancelling $|H|^2$, we have to prove 
that 

$$(\alpha_1 + \beta_1)(\alpha_2 + \beta_2)\cdots(\alpha_k + \beta_k) \ge \alpha_1\alpha_2\cdots\alpha_k, \tag{2.12}$$

and this is evidently true, since $\alpha_i + \beta_i \ge \alpha_i > 0$. Further, equality occurs only when all 
$\beta_i$ are zero, in which case $M = 0$ and so, since $\Phi$ is positive over the range of $x$, each 
$(u_i - t_i)(u_j - t_j) = 0$, and so finally 

$$u_i = t_i, \qquad i = 1, 2, \ldots, k. \tag{2.13}$$

Thus the functions $t_i(x)$, provided that $\phi(x)$, and therefore $\Phi(X)$, satisfies the conditions 
(2.1), uniquely ensure for $|V|$ the minimum value $|\Lambda|$. 

3. Estimation of Mean and Variance from Normal Sample 

The following example (Solomon, 1944) brings out in relief some of the special features 
of the problem of multiple estimation. Let us consider the univariate normal probability 
function 

$$\phi(x; \theta) = (2\pi\sigma^2)^{-1/2}\exp\{-\tfrac12 (x - \mu)^2/\sigma^2\}, \tag{3.1}$$

and let it be desired to estimate, from a sample $x = \{x_1\ x_2 \ldots x_N\}$, the mean $\mu$ and the variance 
$\sigma^2$. We examine first $\partial\log\Phi/\partial\mu$ and $\partial\log\Phi/\partial\sigma^2$ to see whether these can be expressed in 
vector form 

$$\{\partial\log\Phi/\partial\theta\} = \Lambda^{-1}(t - \theta), \tag{3.2}$$

that is to say, the reciprocal set of the equations (2.2) of estimation. It is found, however, 
that the $t_i$ and the $\theta_i$ cannot be extricated from each other in this explicit way. The fact 
is that $\mu$ and $\sigma^2$ are not the parameters to estimate by the method, and on further examination 
(Solomon, 1944) it appears that the basic parameters are really $\theta_1 = \mu$, $\theta_2 = \sigma^2 + \mu^2$. This 
revised choice leads to 

$$\begin{Bmatrix}\partial\log\Phi/\partial\theta_1\\ \partial\log\Phi/\partial\theta_2\end{Bmatrix} = \frac{N}{(\theta_2 - \theta_1^2)^2}\begin{bmatrix}\theta_2 + \theta_1^2 & -\theta_1\\ -\theta_1 & \tfrac12\end{bmatrix}\begin{Bmatrix}t_1 - \theta_1\\ t_2 - \theta_2\end{Bmatrix}, \tag{3.3}$$

which is now a relation of the desired form, and by inspection the estimating functions for 
$\mu$ and $\sigma^2 + \mu^2$ are 

$$t_1 = \sum x/N, \qquad t_2 = \sum x^2/N, \tag{3.4}$$

namely the first and second moments about the origin of measurement, which may be arbitrary. 
At the same time the sampling variance matrix of $t_1$ and $t_2$ is the reciprocal of the premultiplying 
matrix on the right of (3.3), and so is 

$$\Lambda = \frac{\sigma^2}{N}\begin{bmatrix}1 & 2\mu\\ 2\mu & 2(\sigma^2 + 2\mu^2)\end{bmatrix}. \tag{3.5}$$

The sampling variances and product moment of $t_1$ and $t_2$ are the respective diagonal and 
non-diagonal elements of $\Lambda$, and by inspection the minimum value of $|V|$ is 

$$|\Lambda| = 2\sigma^6/N^2. \tag{3.6}$$

These results may be contrasted (Solomon, 1944) with those arrived at by the customary 
process of successive estimation, according to which the estimates of $\mu$ and $\sigma^2$ are 

$$\bar x = \sum x/N, \qquad s^2 = \sum (x - \bar x)^2/(N - 1), \tag{3.7}$$

these being unbiased estimates, independent of each other and having the variance matrix 

$$\begin{bmatrix}\sigma^2/N & \cdot\\ \cdot & 2\sigma^4/(N - 1)\end{bmatrix}. \tag{3.8}$$

The value of $|V|$ for this pair of estimating functions is 

$$|V| = 2\sigma^6/\{N(N - 1)\}. \tag{3.9}$$
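The relations (3.3), (3.5), (3.6) and (3.9) can be checked with exact arithmetic. The sketch below is illustrative only: the numerical values of μ, σ² and N are assumptions chosen for convenience, and the function names are hypothetical. It builds the variance matrix of (3.5), confirms that it is the reciprocal of the premultiplying matrix of (3.3), and evaluates the two determinants.

```python
from fractions import Fraction


def variance_matrix(mu, var, n):
    """Sampling variance matrix of t1 = sum(x)/N and t2 = sum(x^2)/N, as in (3.5)."""
    c = var / n
    return [[c, 2 * mu * c],
            [2 * mu * c, 2 * (var + 2 * mu ** 2) * c]]


def premultiplier(theta1, theta2, n):
    """Premultiplying matrix on the right of (3.3), in the basic parameters."""
    k = n / (theta2 - theta1 ** 2) ** 2
    return [[k * (theta2 + theta1 ** 2), -k * theta1],
            [-k * theta1, k / 2]]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


# Assumed illustrative values: mu = 3, sigma^2 = 2, N = 10.
mu, var, n = Fraction(3), Fraction(2), Fraction(10)

lam = variance_matrix(mu, var, n)
kappa = premultiplier(mu, var + mu ** 2, n)    # theta1 = mu, theta2 = sigma^2 + mu^2

product = matmul2(lam, kappa)                  # should be the unit matrix
det_joint = det2(lam)                          # (3.6): 2 sigma^6 / N^2
det_successive = 2 * var ** 3 / (n * (n - 1))  # (3.9): 2 sigma^6 / {N (N - 1)}
print(product, det_joint, det_successive)
```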


4. Probability Functions Admitting the Postulates 
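By solving the simultaneous partial differential equations 

$$\{\partial\log\Phi/\partial\theta\} = \Lambda^{-1}(t - \theta) \tag{4.1}$$

we may ascertain the form of $\Phi$, and so of $\phi$. The solution is of Koopman type, in the 
multi-parameter form 

$$\Phi = \exp\{f(x) + F(\theta) + \sum t_i(x)\,\gamma_i(\theta)\}, \tag{4.2}$$

where $x$, $\theta$ continue to represent vectors, and where 

$$-\frac{\partial F}{\partial\theta_i} = \kappa_{i1}\theta_1 + \kappa_{i2}\theta_2 + \cdots + \kappa_{ik}\theta_k, \qquad \frac{\partial\gamma_i}{\partial\theta_j} = \kappa_{ij}, \tag{4.3}$$

or, briefly, 

$$-\{\partial F/\partial\theta\} = K\theta, \qquad K = [\kappa_{ij}] = \Lambda^{-1} = [\partial\gamma/\partial\theta]. \tag{4.4}$$

The normal probability function evidently belongs to this class, and indeed the mere 
comparison of 

$$\log\phi = -\tfrac12\log(2\pi) - \tfrac12\Big(\log\sigma^2 + \frac{\mu^2}{\sigma^2}\Big) + x\,\frac{\mu}{\sigma^2} - \frac{x^2}{2\sigma^2} \tag{4.5}$$

with (4.2) might have suggested what is indeed the case, namely that the basic estimating 
functions here involve $\sum x$ and $\sum x^2$ separately. 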


5. Determination of the Basic Parameters 

Let us suppose that the parameters admitting the procedure of estimation, the basic 
parameters, are not $\theta_1, \theta_2, \ldots, \theta_k$, but certain functions $\tau_i(\theta)$ of these with non-singular 
Jacobian with respect to the $\theta_j$. We have then, by differentiating the logarithm of the 
Koopman function, the vector relation 

$$\{\partial\log\Phi/\partial\tau\} = \{\partial F/\partial\tau\} + [\partial\gamma/\partial\tau]\,t = [\partial\gamma/\partial\tau]\{t + \partial F/\partial\gamma\}. \tag{5.1}$$

Comparing this with the equation of estimation, 

$$t - \tau = \Lambda\{\partial\log\Phi/\partial\tau\}, \tag{5.2}$$

we have 

$$\tau = -\{\partial F/\partial\gamma\} = -[\partial\gamma/\partial\theta]^{-1}\{\partial F/\partial\theta\}, \tag{5.3}$$

where $[\partial\gamma/\partial\theta]$ is the Jacobian matrix of the functions $\gamma$ with respect to the parameters $\theta$; 
and the estimating functions, apart from a constant factor $N$, are the $t_i(x)$. 

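It is instructive to determine the five basic parameters and their estimating functions in 
the case of a bivariate normal sample. We have then 

$$\log\phi = -\log(2\pi) - \tfrac12(\log|V| + \mu'V^{-1}\mu) + x'V^{-1}\mu - \tfrac12 x'V^{-1}x, \tag{5.4}$$

where 

$$x = \{x_1\ x_2\}, \qquad \mu = \{\mu_{10}\ \mu_{01}\}, \qquad V = \begin{bmatrix}\mu_{20} & \mu_{11}\\ \mu_{11} & \mu_{02}\end{bmatrix}, \tag{5.5}$$

in the usual notation of bivariate moments. Here $V$ is the variance matrix of $x = \{x_1\ x_2\}$, 
the pair of correlated normal variates. The complete sample vector $X$ would thus be of $2N$ 
elements. We might presume by inspection that the estimating functions would be the means, 
and the mean squares and products, derived from the $N$ sample pairs. Referring to the 
Koopman form, we have 

$$F = -\tfrac12(\log|V| + \mu'V^{-1}\mu), \tag{5.6}$$

and we begin by taking 

$$\theta = \{\mu_{10}\ \mu_{01}\ \mu_{20}\ \mu_{11}\ \mu_{02}\}. \tag{5.7}$$

This gives 

$$-\Big\{\frac{\partial F}{\partial\theta}\Big\} = \begin{bmatrix} |V|^{-1}\begin{bmatrix}\mu_{02} & -\mu_{11}\\ -\mu_{11} & \mu_{20}\end{bmatrix} & \cdot\\ \cdot & |V|^{-2}\begin{bmatrix}\tfrac12\mu_{02}^2 & -\mu_{02}\mu_{11} & \tfrac12\mu_{11}^2\\ -\mu_{02}\mu_{11} & \mu_{11}^2 + \mu_{02}\mu_{20} & -\mu_{20}\mu_{11}\\ \tfrac12\mu_{11}^2 & -\mu_{20}\mu_{11} & \tfrac12\mu_{20}^2\end{bmatrix}\end{bmatrix}\begin{Bmatrix}\mu_{10}\\ \mu_{01}\\ \mu_{20} - \mu_{10}^2\\ \mu_{11} - \mu_{10}\mu_{01}\\ \mu_{02} - \mu_{01}^2\end{Bmatrix}. \tag*{(5.8), (5.9)}$$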


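Denoting by $U$ the premultiplying matrix on the right of (5.9), we find that $[\partial\gamma/\partial\theta]$ is the 
following matrix: 

$$U\begin{bmatrix}1 & \cdot & \cdot & \cdot & \cdot\\ \cdot & 1 & \cdot & \cdot & \cdot\\ -2\mu_{10} & \cdot & 1 & \cdot & \cdot\\ -\mu_{01} & -\mu_{10} & \cdot & 1 & \cdot\\ \cdot & -2\mu_{01} & \cdot & \cdot & 1\end{bmatrix} = U\begin{bmatrix}I & \cdot\\ -[\partial\mu^{[2]}/\partial\mu] & I\end{bmatrix}, \tag{5.10}$$

where 

$$\mu^{[2]} = \{\mu_{10}^2\ \ \mu_{10}\mu_{01}\ \ \mu_{01}^2\}. \tag{5.11}$$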

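Applying now the rule (5.3), we find that $U$ and $U^{-1}$ annihilate each other and leave 

$$\tau = \begin{bmatrix}1 & \cdot & \cdot & \cdot & \cdot\\ \cdot & 1 & \cdot & \cdot & \cdot\\ -2\mu_{10} & \cdot & 1 & \cdot & \cdot\\ -\mu_{01} & -\mu_{10} & \cdot & 1 & \cdot\\ \cdot & -2\mu_{01} & \cdot & \cdot & 1\end{bmatrix}^{-1}\begin{Bmatrix}\mu_{10}\\ \mu_{01}\\ \mu_{20} - \mu_{10}^2\\ \mu_{11} - \mu_{10}\mu_{01}\\ \mu_{02} - \mu_{01}^2\end{Bmatrix} \tag{5.12}$$

$$= \begin{bmatrix}1 & \cdot & \cdot & \cdot & \cdot\\ \cdot & 1 & \cdot & \cdot & \cdot\\ 2\mu_{10} & \cdot & 1 & \cdot & \cdot\\ \mu_{01} & \mu_{10} & \cdot & 1 & \cdot\\ \cdot & 2\mu_{01} & \cdot & \cdot & 1\end{bmatrix}\begin{Bmatrix}\mu_{10}\\ \mu_{01}\\ \mu_{20} - \mu_{10}^2\\ \mu_{11} - \mu_{10}\mu_{01}\\ \mu_{02} - \mu_{01}^2\end{Bmatrix} = \begin{Bmatrix}\mu_{10}\\ \mu_{01}\\ \mu_{20} + \mu_{10}^2\\ \mu_{11} + \mu_{10}\mu_{01}\\ \mu_{02} + \mu_{01}^2\end{Bmatrix}. \tag{5.13}$$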


Thus the functions $\tau(\theta)$ are indeed the moments of first and second order, and those of second 
order are taken about the origins of measurement, not the means of sample. It follows 
also that the variance matrix of the estimators is 

$$[\partial\gamma/\partial\tau]^{-1} = \begin{bmatrix}I & \cdot\\ [\partial\mu^{[2]}/\partial\mu] & I\end{bmatrix} U^{-1}\begin{bmatrix}I & [\partial\mu^{[2]}/\partial\mu]'\\ \cdot & I\end{bmatrix}, \tag{5.14}$$

which is readily evaluated, and in § 6 is given in a general form. 

From (5.9) and (5.14) the generalized variance is $|\partial\gamma/\partial\tau|^{-1} = 8|V|^4$, or, for a sample 
of $N$ values, $8|V|^4/N^5$. 
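The passage from (5.10) to (5.13) rests only on inverting a unit lower-triangular matrix, and can be verified with exact arithmetic. The sketch below is illustrative: the moment values are arbitrary assumptions, and the names `lower`, `lower_inv` and `w` are hypothetical labels for the second factor of (5.10), its inverse, and the vector on the right of (5.9).

```python
from fractions import Fraction


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Assumed illustrative means and central moments of second order.
m10, m01 = Fraction(1), Fraction(2)
m20, m11, m02 = Fraction(3), Fraction(1), Fraction(4)

# Second factor of (5.10): unit matrix with -d(mu^[2])/d(mu) in the lower-left block.
lower = [[1, 0, 0, 0, 0],
         [0, 1, 0, 0, 0],
         [-2 * m10, 0, 1, 0, 0],
         [-m01, -m10, 0, 1, 0],
         [0, -2 * m01, 0, 0, 1]]

# Its inverse: the same matrix with the sign of the lower-left block reversed.
lower_inv = [[1, 0, 0, 0, 0],
             [0, 1, 0, 0, 0],
             [2 * m10, 0, 1, 0, 0],
             [m01, m10, 0, 1, 0],
             [0, 2 * m01, 0, 0, 1]]

# Vector on the right of (5.9).
w = [m10, m01, m20 - m10 ** 2, m11 - m10 * m01, m02 - m01 ** 2]

tau = matvec(lower_inv, w)   # basic parameters, as in (5.12) and (5.13)
raw_moments = [m10, m01, m20 + m10 ** 2, m11 + m10 * m01, m02 + m01 ** 2]
print(tau)
```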


6 . The Multivariate Normal Case 

The argument of § 5 can be extended by matrix notation to the case of a multivariate 
normal sample. First, the matrix of (5.9) is of a recognizable type. It is seen to be partitioned 
into two isolated principal submatrices, of which the leading one of order 2 × 2 is evidently 
$V^{-1}$, and the other, of order 3 × 3, is $\tfrac12(V^{[2]})^{-1}$, where $V^{[2]}$ is the second induced or Schläflian 
matrix, defined as that matrix which, when $V$ transforms an arbitrary vector $z$, transforms 
the vector $z^{[2]}$ having for elements (as in (5.11)) the squares and products of elements of $z$, 
in due priority. The properties of $V^{[2]}$, as of the more general $V^{[k]}$, are well known in the 
literature. 

To show that the results are of the conjectured form in the general multivariate case we 
introduce operators of vector and matrix differentiation and examine their properties. Let 
us denote by 

$$\mu = \{\mu_1\ \mu_2 \ldots \mu_n\}, \qquad V = [v_{ij}] \tag{6.1}$$

the $n \times 1$ vector of means and the $n \times n$ matrix of variances of the $n$ normal variates 
$x = \{x_1\ x_2 \ldots x_n\}$. Let us also write 

$$\partial/\partial\mu, \quad \partial/\partial\mu' \quad \text{and} \quad \Omega = [\epsilon_{ij}\,\partial/\partial v_{ij}], \tag{6.2}$$

where $\epsilon_{ii} = 1$, $\epsilon_{ij} = \tfrac12$, $i \neq j$, for the vector and matrix operations of differentiation associated 
with the elements of $\mu$ and of $V$, the operands being scalar functions of these elements 
occurring as elements of vectors or matrices. Then we find without difficulty, 

=3 v-y, F-y) = v-\ Q log | v | = v-\ (6.3) 

ay'F-y>=(FP])-yM o ( F-y)=(F m )~^> (6-4) 

where $\mu^{[2]}$ is the vector which has for elements the squares and products of the elements of 
$\mu$ duly arranged. It is to be noted that the operator $\Omega$ differs from the Cayley operator 
$[\partial/\partial a_{ij}]$ in that the latter (cf. Turnbull, 1927) refers to a general matrix $A = [a_{ij}]$ of $n^2$ 
independent elements, whereas in the present case $V$ is symmetric and has only $\tfrac12 n(n + 1)$ 
independent elements. 

By way of illustration of (6.4), let us obtain Q \x! by making a census of various types 
of result; for example, if [ V iS ] denotes the cofactor of % in | V |, then we have 

[a^(l v u \i\ f|)]=(| v ik \ | v n l + l v u \I F jh \)/\ v\\ 


( 6 . 5 ) 




the right-hand member being a typical element of (F^ 1 ) 1 ^ ^(F 1 ^) -1 , and so 

£V (6.6) 

At the same time, 

^=[ I I/I V\]={ \ V ij \\v\j \ V\*\ (6.7) 

- ‘2®»( I I I Pwl + I Vn \\Vn\)!\V l 2 l, >5 < /, (6.8) 

L*,i J 


many redundant terms vanishing in the summation because '^? u v kl V il = o , i ¥> k > 

x 

Thus we obtain a vector form, useful for our purpose, of the relation of $V^{-1}$ to $V$, namely 
that if all the $\tfrac12 n(n + 1)$ elements in and above the diagonal of $V^{-1}$ be written in order, row 
after row, as elements of a vector, then this vector is 

m- 1 *, (6.9) 


where $v$ is the corresponding vector containing, in the same order, the $\tfrac12 n(n + 1)$ elements of $V$. 

Hence finally, gathering all these results together and writing the partitioned vector 
$\{\mu \mid v\}$ as $\theta$, we have 


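$$\{\partial F/\partial\theta\} = -\begin{bmatrix}V^{-1} & \cdot\\ \cdot & \tfrac12(V^{[2]})^{-1}\end{bmatrix}\begin{Bmatrix}\mu\\ v - \mu^{[2]}\end{Bmatrix}, \tag{6.10}$$

$$[\partial\gamma/\partial\theta] = \begin{bmatrix}V^{-1} & \cdot\\ \cdot & \tfrac12(V^{[2]})^{-1}\end{bmatrix}\begin{bmatrix}I & \cdot\\ -[\partial\mu^{[2]}/\partial\mu] & I\end{bmatrix}. \tag{6.11}$$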


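As for the submatrix $[\partial\mu^{[2]}/\partial\mu]$ we shall use the easily proved result 

$$[\partial\mu^{[2]}/\partial\mu]\,\mu = 2\mu^{[2]}, \tag{6.12}$$

which is the vector analogue of the scalar relation $z\,\dfrac{d}{dz}z^2 = 2z^2$. For example, in the case 
$n = 2$ we have 

$$\mu = \{\mu_{10}\ \mu_{01}\}, \qquad \mu^{[2]} = \{\mu_{10}^2\ \ \mu_{10}\mu_{01}\ \ \mu_{01}^2\}, \tag{6.13}$$

$$[\partial\mu^{[2]}/\partial\mu]\,\mu = \begin{bmatrix}2\mu_{10} & \cdot\\ \mu_{01} & \mu_{10}\\ \cdot & 2\mu_{01}\end{bmatrix}\begin{Bmatrix}\mu_{10}\\ \mu_{01}\end{Bmatrix} = 2\begin{Bmatrix}\mu_{10}^2\\ \mu_{10}\mu_{01}\\ \mu_{01}^2\end{Bmatrix},$$

and similarly in general. 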

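From (5.3) the vector of parameters to be estimated is 

$$-[\partial\gamma/\partial\theta]^{-1}\{\partial F/\partial\theta\} = \begin{bmatrix}I & \cdot\\ [\partial\mu^{[2]}/\partial\mu] & I\end{bmatrix}\begin{Bmatrix}\mu\\ v - \mu^{[2]}\end{Bmatrix} \tag{6.14}$$

$$= \begin{Bmatrix}\mu\\ v + \mu^{[2]}\end{Bmatrix}. \tag{6.15}$$

The final outcome is that we estimate the means $\mu_1, \mu_2, \ldots, \mu_n$ and the quadratic moments 
about the origin of measurement. At the same time the sampling variance matrix of the 
estimators is 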



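$$\Lambda = N^{-1}\begin{bmatrix}I & \cdot\\ [\partial\mu^{[2]}/\partial\mu] & I\end{bmatrix}\begin{bmatrix}V & \cdot\\ \cdot & 2V^{[2]}\end{bmatrix}\begin{bmatrix}I & [\partial\mu^{[2]}/\partial\mu]'\\ \cdot & I\end{bmatrix}, \tag{6.16}$$

that is, 

$$\Lambda = N^{-1}\begin{bmatrix}V & V[\partial\mu^{[2]}/\partial\mu]'\\ [\partial\mu^{[2]}/\partial\mu]V & 2V^{[2]} + [\partial\mu^{[2]}/\partial\mu]V[\partial\mu^{[2]}/\partial\mu]'\end{bmatrix}, \tag{6.17}$$

from which all the variances and covariances may be readily found. This is the matrix 
generalization of Solomon’s result of (3.5). By known facts regarding the determinant of 
induced matrices, the value of $|\Lambda|$ is found from (6.16) to be 

$$2^{\frac12 n(n+1)}\,N^{-\frac12 n(n+3)}\,|V|\,|V|^{n+1} = 2^{-n}(2/N)^{\frac12 n(n+3)}\,|V|^{n+2}, \tag{6.18}$$

in agreement with the values already found, $2\sigma^6/N^2$ for $n = 1$ and $8|V|^4/N^5$ for $n = 2$. 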

At this point some comments may be made. The guiding postulates, like any others 
that might have been adopted, are arbitrary, and have led to formulae of estimation differing 
slightly from the customary ones. It is usual, in a multivariate normal sample, first to 
estimate the $n$ means in the usual way, then to estimate quadratic moments about the sample 
means as follows: 

$$\hat v_{ij} = \sum (x_i - \bar x_i)(x_j - \bar x_j)/(N - 1). \tag{6.19}$$

The justification for the divisor $N - 1$ is that with this divisor the estimate $\hat v_{ij}$ is indeed unbiased, 
in the sense that its mean value over all possible samples is $v_{ij}$. When $n$ exceeds 1, however, 
the estimating functions of the $v_{ij}$ are not all independent. The $\hat v_{ii}$ and $\hat v_{jj}$ are independent 
of one another, but the $\hat v_{ij}$, $i \neq j$, are dependent on the $\hat v_{ii}$ and $\hat v_{jj}$. The variance-determinant 
of these estimates and those of the means is 

$$\{2/(N - 1)\}^{\frac12 n(n+1)}\,N^{-n}\,|V|^{n+2}, \tag{6.20}$$

which stands in contrast with (6.18) above. 

Such divergencies do not affect procedures of estimation and tests of significance adversely. 
If estimating functions are computed from sample values according to a logical prescription, 
and if tables of probability are prepared in the light of this prescription and no other, then 
the tests based upon them have their own validity. 


7. The Case of Ordinary Linear Least Squares 

The point of interest arises whether, in the classical case of Least Squares applied to 
linear observational equations, the adoption of the postulate of unbiased solutions and minimal 
determinant of variance will produce a result differing from that given by the normal equations 
of Gauss. The answer to this question has been given (Aitken, 1934) almost explicitly; 
namely the results are exactly the same. In fact, let the observational equations be the 
prepared set $Ax = u$. Then the usual normal equations are $A'Ax = A'u$. Now, adapting the 
present postulates to linear estimation, let us take the solution to be $x = Bu$, where 

(i) $Bu$ is unbiased, which in this context means that $BA = I$; 

(ii) the variance determinant $|BB'|$ is minimal. 

Now 

$$|BB'| = (BB')^{(n)}, \tag{7.1}$$

namely the $n$th compound of $BB'$, while the $n$th compound of $BA = I$ gives, by the 
multiplicative property of compound matrices, the compound condition of unbiasedness, 

$$(BA)^{(n)} = B^{(n)}A^{(n)} = I. \tag{7.2}$$

Thus we are minimizing the trace of $(BB')^{(n)}$ subject to (7.2), and this is equivalent 
(Aitken, 1934) to one form of the accepted postulates of Least Squares, but at the level of 
$n$th compounds, and therefore leading to the solution 

$$B^{(n)} = [(A'A)^{(n)}]^{-1}(A')^{(n)}. \tag{7.3}$$

Now this is the $n$th compound relation corresponding to 

$$B = (A'A)^{-1}A', \quad \text{that is, to} \quad A'Ax = A'u. \tag{7.4}$$

Thus the solutions of the classical normal equations minimize at the same time the 
determinant $|BB'|$; and so the solutions according to our postulates, provided only that they 
are unique, coincide with those given by the classical procedure. An exactly similar argument 
holds for the case where the observations of the linear functions are affected by correlated 
errors. 
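The conclusion of (7.4) can be illustrated on a small example. The sketch below is an assumption-laden illustration, not the argument by compound matrices used in the text: it takes a hypothetical 3 × 2 matrix $A$, forms $B_0 = (A'A)^{-1}A'$, and compares $|BB'|$ for $B_0$ with its value for another unbiased $B = B_0 + C$, where every row of $C$ is a multiple of a vector annihilating the columns of $A$ (so that $CA = 0$ and $BA = I$ still holds).

```python
from fractions import Fraction


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inv2(m):
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]


# Assumed observational matrix A (three equations, two unknowns).
a = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(1), Fraction(2)]]
at = transpose(a)

# Classical solution matrix B0 = (A'A)^{-1} A' of the normal equations.
b0 = matmul(inv2(matmul(at, a)), at)

# A vector orthogonal to both columns of A, so that null' A = 0.
null = [Fraction(1), Fraction(-2), Fraction(1)]


def perturbed(k1, k2):
    """Another unbiased solution matrix: B0 plus multiples of null in each row."""
    return [[b0[0][j] + k1 * null[j] for j in range(3)],
            [b0[1][j] + k2 * null[j] for j in range(3)]]


def variance_det(b):
    """The variance determinant |BB'|."""
    return det2(matmul(b, transpose(b)))


print(variance_det(b0), variance_det(perturbed(1, 0)))
```

In this example the perturbed matrix remains unbiased but has the larger variance determinant, in keeping with the statement that the classical solution also minimizes $|BB'|$.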

This discussion suggests a variety of alternatives to the postulates of minimum determinant 
of variance. It suggests that we might minimize the trace, or the trace of any compound, 
that is, the sum of principal minors of a given order $m$. It is likely that the results would be 
exactly the same as we have found. 

Assisted in publication by a grant from the Carnegie Trust for the Universities of 
Scotland. 


REFERENCES TO LITERATURE 

Aitken, A. C., 1934. “On least squares and linear combination of observations”, Proc. Roy. Soc. 
Edin., A, LV, 42-47. 

Aitken, A. C., and Silverstone, H., 1941. “On the estimation of statistical parameters”, Proc. 
Roy. Soc. Edin., A, LXI, 186-194. 

Fisher, R. A., 1921. “On the mathematical foundations of theoretical statistics”, Phil. Trans., 
A, CCXXII, 309-368. 

-, 1934. “Two new properties of mathematical likelihood”, Proc. Roy. Soc., A, CXLIV, 285-307. 

Geary, R. C., 1942. “The estimation of many parameters”, Journ. Roy. Statist. Soc., CV, 213-217. 

Koopman, B. O., 1936. “On distributions admitting a sufficient statistic”, Trans. Amer. Math. 
Soc., XXXIX, 399-409. 

Rao, C. R., 1945. “Information and the accuracy attainable in the estimation of statistical 
parameters”, Bull. Calcutta Math. Soc., XXXVII, 81-91. 

Solomon, L., 1944. “The estimation of statistical coefficients from sample”, Thesis submitted for 
Ph.D., University of Edinburgh. 

Turnbull, H. W., 1927. “On differentiating a matrix”, Proc. Edin. Math. Soc., ser. 2, I, 111-128. 

Wilks, S. S., 1932. “Certain generalizations in the analysis of variance”, Biometrika, XXIV, 
471-494. 


CORRIGENDUM 

In the paper on single-parameter estimation (Aitken and Silverstone, 1941) it is stated 
(§ 5) that the mean can be estimated, in the sense of the postulates, as a basic parameter in a 
series of Type A. Closer examination (Solomon, 1944) shows that this possibility is ruled 
out by the condition of total probability. The normal curve is the only curve of Type A for 
which the statement is true. 


(Issued separately September 22, 1948) 




XXXIX.—Transformations of Hypergeometric Functions of Two Variables. 
By A. Erdélyi, Mathematical Institute, The University, Edinburgh 

(MS. received May 29, 1946. Read November 18, 1946) 


1. There are several methods for obtaining transformations of hypergeometric functions 
of two variables. 

Firstly, by transformation of the hypergeometric series. When the double series is 
rewritten as an infinite sum of hypergeometric functions of one variable, the known trans¬ 
formation theory of such functions can be applied to each term. This method is quite simple 
and, in a limited range, very effective for discovering transformations as well as proving them. 

Secondly, by transformation of the systems of partial differential equations satisfied by 
the hypergeometric functions. This method, though simple in theory, is rather laborious in 
practice and not very useful for discovering new transformations. 

Thirdly, by transformation of contour integrals representing the hypergeometric functions. 
This third method will be applied here to prove a number of transformations. Some of these 
transformations have previously been obtained, by other writers, by the first method; and 
others could be so obtained. There are, however, transformations, such as (9.1), which it 
would have been difficult to discover by either of the first two methods. 

Only complete hypergeometric functions of order two will be considered. In the notation 
of Horn* these are the functions for which $p = p' = q = q' = 2$, and which are denoted by the 
symbols $F_1, \ldots, F_4$, $G_1, \ldots, G_3$, $H_1, \ldots, H_7$. (The definition of these functions is given 
in Horn’s paper.) Transformations of confluent functions can be deduced from those of 
complete functions by limiting processes. 

Even with this restriction it would be tiresome to list all transformations. A selection has 
been made so as to exhibit a property of hypergeometric functions the discovery of which 
caused me to develop, three years ago, the present transformation theory. It will be seen 
that with three possible exceptions all complete hypergeometric functions of two variables 
and of order two can be expressed in terms of Appell’s series $F_2$, the basic importance of which 
is thus revealed. 

2. The familiar abbreviation 

$$(c)_n = \frac{\Gamma(c + n)}{\Gamma(c)}$$

will be used. 

The contours of integration will be Pochhammer double loops and $[a_1, \ldots, a_m;\ b_1, \ldots, b_n]$ 
will be the notation for the double loop which contains $a_1, \ldots, a_m$ within one loop and 
$b_1, \ldots, b_n$ within the other. It is understood that all other singularities of the integrand are 
outside the contour of integration. $z^a$ is interpreted as $\exp(a \log z)$, where $\log z$ is real when 
$z$ is positive, and continuous on the contour of integration. With these conventions 

$$\int_{[0;\,1]} (-t)^{\alpha-1}(t - 1)^{\beta-1}\,dt = \frac{(2\pi i)^2}{\Gamma(1 - \alpha)\Gamma(1 - \beta)\Gamma(\alpha + \beta)}, \tag{2.1}$$

and all our contour integrals will be based on this formula. 

It will be assumed that x and y have such values that the infinite series occurring in the 
analysis converge, and that the grouping of singularities indicated in the symbol for the double 
loop is possible. Exceptional values of the parameters which would make some of the gamma 
functions become infinite are tacitly excluded. The general validity of the results follows by 
analytic continuation. 


* (7). 

Hypergeometric functions of two variables, $x$ and $y$, will be represented by integrals of the 
form 

$$\int (-t)^{-\rho}(t - 1)^{-\rho'} f(u)\,g(v)\,dt,$$

where $u$ is a function of $x$ and $t$, and $v$ of $y$ and $t$. We then say that the Euler transformation 
factorises our hypergeometric function, i.e. maps it into a product of two functions, $f$ and $g$, 
each of which satisfies an ordinary differential equation. 

Linear Transformations 

3. From the definition of $F_2$ the integral representation 

$$F_2(\rho + \rho' - 1, \beta, \beta', \gamma, \gamma'; x, y) = (2\pi i)^{-2}\,\Gamma(\rho)\Gamma(\rho')\Gamma(2 - \rho - \rho') \times \int (-t)^{-\rho}(t - 1)^{-\rho'}\,F\Big(\rho, \beta; \gamma; \frac{x}{t}\Big)\,F\Big(\rho', \beta'; \gamma'; \frac{y}{1 - t}\Big)\,dt, \tag{3.1}$$

with $[0, x;\ 1, 1 - y]$ as contour of integration immediately follows. This is the factorised 
form of $F_2$; the general solution of the system of partial differential equations of $F_2$ is obtained 
by replacing each of the Gauss series in the integrand by the general solution of the 
hypergeometric equation, and the contour by any contour closed on the Riemann surface of the 
integrand. 

Applying Euler’s transformation * 

$$F(a, b; c; z) = (1 - z)^{-a}\,F\Big(a, c - b; c; \frac{z}{z - 1}\Big) \tag{3.2}$$

to the first hypergeometric function in the integral in (3.1) and introducing $(x - t)/(x - 1)$ as a 
new variable of integration, the transformation † 

$$F_2(\alpha, \beta, \beta', \gamma, \gamma'; x, y) = (1 - x)^{-\alpha}\,F_2\Big(\alpha, \gamma - \beta, \beta', \gamma, \gamma'; \frac{x}{x - 1}, \frac{y}{1 - x}\Big) \tag{3.3}$$

of $F_2$ is obtained. Applying (3.2) to the second or to both hypergeometric functions in the 
integral in (3.1), two similar transformations ‡ follow. 
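Euler's transformation (3.2) lends itself to a quick numerical check. The sketch below is illustrative only: `gauss_series` is a hypothetical helper that sums a fixed number of terms of the Gauss series (valid for $|z| < 1$), and the parameter values are arbitrary assumptions chosen so that both $z$ and $z/(z - 1)$ lie inside the unit circle.

```python
def gauss_series(a, b, c, z, terms=200):
    """Partial sum of the Gauss series F(a, b; c; z); assumes |z| < 1."""
    total = 1.0
    term = 1.0
    for n in range(terms):
        # Ratio of successive terms: (a+n)(b+n) z / ((c+n)(n+1)).
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
    return total


# Assumed illustrative values.
a, b, c, z = 0.5, 1.25, 2.0, 0.3

lhs = gauss_series(a, b, c, z)
rhs = (1 - z) ** (-a) * gauss_series(a, c - b, c, z / (z - 1))
print(lhs, rhs)
```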

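4. Another important transformation flows from the relation connecting different branches 
of Riemann’s P function. It will be convenient to write this relation in the form § 

$$F(a, b; c; z) = \frac{\Gamma(c)\Gamma(b - a)}{\Gamma(b)\Gamma(c - a)}\,(1 - z)^{-a}\,F\Big(a, c - b; 1 + a - b; \frac{1}{1 - z}\Big) + a \rightleftarrows b, \tag{4.1}$$

where the symbol $+\ a \rightleftarrows b$ indicates that on the right-hand side a second term must be added 
which originates from the first one by interchanging $a$ and $b$. 

The transformation alluded to expresses a certain combination of two functions $F_2$ as an $H_2$, 
and we shall therefore start with the integral representation 

$$H_2(\rho - \sigma, \beta, \gamma, \delta, \epsilon; x, y) = (2\pi i)^{-2}\,\Gamma(\rho)\Gamma(1 - \sigma)\Gamma(1 - \rho + \sigma) \times \int_{[0,\,x;\ 1]} (-t)^{-\rho}(t - 1)^{\sigma-1}\,F\Big(\rho, \beta; \epsilon; \frac{x}{t}\Big)\,F(\gamma, \delta; \sigma; (t - 1)y)\,dt \tag{4.2}$$

which follows from the definition, and is the factorised form, of $H_2$. For the present purpose 
it is more convenient to use the factorisation 

$$H_2 = (2\pi i)^{-2}\,\Gamma(\rho)\Gamma(1 - \rho + \sigma)\Gamma(1 + \gamma - \sigma)\Gamma(1 + \delta - \sigma)/\Gamma(1 + \gamma + \delta - \sigma) \times \int (-t)^{-\rho}(t - 1)^{\sigma-1}\,F\Big(\rho, \beta; \epsilon; \frac{x}{t}\Big)\,F(\gamma, \delta; 1 + \gamma + \delta - \sigma; 1 - (t - 1)y)\,dt. \tag{4.3}$$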

* 00 * P* 3 > equation (6). t (i), p* 25, equation (31')* 

J (1), p. 25, equations (31), (32). 

§ (*)» P- It equation (io ? ): note the relations y & =y n andj/ e ^y 12 . 



The proof of this integral representation with $[0, x;\ 1]$ as contour follows from the remark that 
the difference of the right-hand sides of (4.3) and (4.2) is the integral over $[0, x;\ 1]$ of a function 
which is regular at $t = 1$: this difference therefore vanishes. 

The integrand of (4.3) is regular at $t = 1 + y^{-1}$ and therefore the contour $[0, x;\ 1]$ may be 
replaced by $[0, x;\ 1, 1 + y^{-1}]$. Using this latter contour and applying (4.1) to the second 
hypergeometric function in the integral (4.3), this integral breaks up into two, each of the 
form (3.1): the result is the transformation 

H 2 (a, p, y, 8, e; x, y) = - a - y/ _7F2 ( a + y ’ & y ’ e ’ 1 + F -S; x > ~^j +y (44) 

At first this result may seem to suggest that conversely $F_2$ might be expressible as a linear 
combination of two functions $H_2$; this, however, is not the case. Indeed, $F_2(x, -1/y)$ can be 
expressed as a combination of two hypergeometric series in $x$, $y$: one of these series is an 
$H_2(x, y)$, the other, however, is a series which is not contained in Horn’s list * and, in fact, 
according to the accepted classification would rank as a hypergeometric series of order three 
(while $F_2$ and $H_2$ are of order two). This appears to be an indication of the inadequacy of 
the classification of hypergeometric series of two variables accepted at present. 

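5. Similarly we obtain a transformation of $F_3$ into two $H_2$. From the definition of $F_3$, 

$$F_3(\alpha, \alpha', \beta, \beta', \rho + \rho'; x, y) = (2\pi i)^{-2}\,\Gamma(1 - \rho)\Gamma(1 - \rho')\Gamma(\rho + \rho') \times \int_{[0;\,1]} (-t)^{\rho-1}(t - 1)^{\rho'-1}\,F(\alpha, \beta; \rho; tx)\,F(\alpha', \beta'; \rho'; (1 - t)y)\,dt. \tag{5.1}$$

As in the previous case, this representation is equivalent to 

$$F_3 = (2\pi i)^{-2}\,\Gamma(1 - \rho')\Gamma(\rho + \rho')\Gamma(1 + \alpha - \rho)\Gamma(1 + \beta - \rho)/\Gamma(1 + \alpha + \beta - \rho) \times \int (-t)^{\rho-1}(t - 1)^{\rho'-1}\,F(\alpha, \beta; 1 + \alpha + \beta - \rho; 1 - xt)\,F(\alpha', \beta'; \rho'; (1 - t)y)\,dt, \tag{5.2}$$

the contour of integration being either $[0;\ 1]$ or $[0, x^{-1};\ 1]$. With the latter contour, (4.1) is 
applied to $F(\ldots; 1 - xt)$, thus causing the integral in (5.2) to break up in two integrals, each 
of the type (4.2). The transformation 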

F a (a, a', P, P', y; x,y) 

+ «“y> «, P', r+a-p; -yj+a^= p (5.3) 

follows at once. There is a corresponding transformation in $y$ of $F_3$. Combining (5.3) with 
(4.4) a transformation of $F_3$ into four $F_2$ results.† This transformation together with (5.3) 
and its analogue in $y$ constitute the extension to $F_3$ of (4.1). 

The transformations derived in sections 3-5 and their diverse combinations are the only 
linear transformations between $F_2$, $F_3$, and $H_2$ with arbitrary values of the parameters. 

6. In this section linear transformations of $F_2$ and of the associated functions $F_3$ and $H_2$ 
will be discussed when the five parameters are connected by one relation. 

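First let us assume that $\alpha - \gamma - \gamma' + 1 = 0$ in $F_2$. Then in (3.1) we may take $\rho = \gamma$, $\rho' = \gamma'$, 
and have 

$$F_2(\gamma + \gamma' - 1, \beta, \beta', \gamma, \gamma'; x, y) = (2\pi i)^{-2}\,\Gamma(\gamma)\Gamma(\gamma')\Gamma(2 - \gamma - \gamma') \int_{[0,\,x;\ 1,\,1-y]} (-t)^{-\gamma}(t - 1)^{-\gamma'}\Big(1 - \frac{x}{t}\Big)^{-\beta}\Big(1 - \frac{y}{1 - t}\Big)^{-\beta'}\,dt. \tag{6.1}$$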


* (7)> p. 383- 


t (0» P- 43> equation (34). 


Here we may expand 




= (x - x)~P[ x - 


# x — A"' 3 


I -X / 


=( x -*)- p 2 


(ft. 


m=0 ' 


X X -t 


ml Vi -x t 


and find at once 


x 


F 2 (y+y'~ V& y> y'; WHJi-y, ft, y+y'-i, p', y; —, -7L (6.2) 




If the factor {1 -y/(i - i* 1 (6.1) is also expanded in a similar way, a second transformation 

of the same function is found, viz. 

* — (6-3) 


F a (y +/ - 1, ft ft, y, y'; x,y) = (x -x)~^x -y)~F G 2 (ft ft, x -y, i -/ ; t % _ y 
On the other hand, if a-t-a/ = y in F 3 , we may take p=a, p ~ 0! in (5.1): writing 
F(a', ft; a' ; (x - /)y) = (i -y +y /)~ l> '= (x x + — ^ 


i -y. 

and expanding in powers of /, we immediately obtain the transformation formula * 


F 3 (a, a', p, p f , a + a'; y)**(x -y )-*Fd (a, j8, a + a'; a?, 




(6.4) 


Many more transformations of the functions considered in this section are obtainable by 
the standard procedure of transformation of the variable of integration combined in some 
cases with deformation of the contour. 


Quadratic Transformations 


7. With our method, quadratic transformations of hypergeometric functions of two 
variables arise in two different ways. From integral representations of the type used in 
sections 3-5, quadratic transformations of hypergeometric functions of two variables arise as 
consequences of such transformations of hypergeometric functions of one variable in the 
integrand. For the more special functions possessing integral representations of the type 
used in section 6, quadratic transformations are the consequence of two of the exponents of 
the integrand being equal. Let us first obtain some transformations belonging to the first 
group. The quadratic transformation of Gauss's series will be used in the form † 

$$F(\alpha, \beta; 2\beta; z) = (1 - \tfrac12 z)^{-\alpha}\,F\Big(\tfrac12\alpha, \tfrac12\alpha + \tfrac12; \beta + \tfrac12; \Big(\frac{z}{2 - z}\Big)^2\Big). \tag{7.1}$$
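The quadratic transformation (7.1) of Gauss's series can likewise be checked numerically. The sketch below is illustrative only: `gauss_series` is a hypothetical helper summing a fixed number of terms of the series (valid for arguments inside the unit circle), and the values of α, β and z are arbitrary assumptions.

```python
def gauss_series(a, b, c, z, terms=200):
    """Partial sum of the Gauss series F(a, b; c; z); assumes |z| < 1."""
    total = 1.0
    term = 1.0
    for n in range(terms):
        # Ratio of successive terms: (a+n)(b+n) z / ((c+n)(n+1)).
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
    return total


# Assumed illustrative values.
alpha, beta, z = 0.7, 1.3, 0.4

lhs = gauss_series(alpha, beta, 2 * beta, z)
w = (z / (2 - z)) ** 2   # argument of the transformed series
rhs = (1 - z / 2) ** (-alpha) * gauss_series(alpha / 2, alpha / 2 + 0.5, beta + 0.5, w)
print(lhs, rhs)
```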

If $\gamma = 2\beta$ in (3.1), (7.1) can be applied to $F(x/t)$ and the result is 

Up+p'-h ft ft, 2ft /; x^) = (2m)-T(p)r(ft)r(2-p-pO 

xj(^-/)-p(/-x)-p'F^p, fr+x; p+b (ft, ft; y'; ~) dt - <7.*> 


Introducing a new variable of integration u by the substitution t~%x + (i -^x)u } the integral 
in (7.2) changes into 


(1 j"(- x)-P’F\^p, ip + 1; /3 + *; (a - x y u * )\P'’ y ' ; 


(r -&e)(i -u) 


du, (7. 


where the contour of integration in the zd-plane is [o, xj{2 -x); 1,1 —y/( 1 -£*)]. Expanding 
the hypergeometric functions in the integrand, the quadratic transformation 


( ^2 2 
®, P , P +i, y; ^(2 — *) 2 ’ 2 —* 


(74) 


* (*), P- as, equation (34). 


t (6), equation (45). 





is obtained. Similar is the proof of the transformation 


H a (a, ft y, 8, 2/3; x,y)=(i~ y, 8, fi + }; ~ $*))• (7-5) 

Again, if in the last integral, (7.3), y'=2/3', the second hypergeometric function admits of a 
quadratic transformation which induces the transformation 

/ 16# y 2 \ 

H 4 (a, ft y, 2 j 8; a,y) = (i- £y)- a F 4 (^a, |a + £, y, /3 + £; ^3^, j. (7.6) 


While both (7.4) and (7.6) seem to be new, their combination, a quadratic transformation of 
$F_2(\alpha, \beta, \beta', 2\beta, 2\beta'; x, y)$ into an $F_4$, has been proved in a different way by Bailey.* 

There is also another relation between H 4 and F 4 , 

F 4 (<x, cl + J — ft, y, f$ 4 -\ 1 x, y) == (1 a H 4 ^ct, a 4 - J — /?, /? + J, y5 ^ j +yj’ ( 7 * 7 ) 
which in connection with (7.4) yields a second transformation,! 

F 4 (a, a + i-ft y, 0 + x,y 2 )=(i +y)-* a F 2 (a, a + |-ft ft y, 2/3; ^~- 2 , (7.8) 

between F 2 and F 4 . 

8. The second group of quadratic transformations arises when the parameters of $F_2$ 
(or $H_2$, or $F_3$) satisfy two independent relations. If, e.g., $\alpha = \gamma + \gamma' - 1$ and $\beta = \beta'$ in $F_2$, 
from (6.1) 

$$F_2(\gamma + \gamma' - 1, \beta, \beta, \gamma, \gamma'; x, y) = (2\pi i)^{-2}\,\Gamma(\gamma)\Gamma(\gamma')\Gamma(2 - \gamma - \gamma') \times \int (-t)^{-\gamma}(t - 1)^{-\gamma'}\Big\{1 - x - y + 2xy - y(1 - x)\frac{t}{1 - t} - x(1 - y)\frac{1 - t}{t}\Big\}^{-\beta}\,dt. \tag{8.1}$$

If $x$ and $y$ are sufficiently small, the contour of integration, $[0, x;\ 1, 1 - y]$, may be so chosen 
that $|y(1 - x)t/(1 - t)| + |x(1 - y)(1 - t)/t| < |1 - x - y + 2xy|$ at all points of the contour. 
This being so, the expansion 

{. . — —)’( <l - y) — A ” 

" ml nl\i *-x-y + 2yx i -t] \x-x~y + 2xy t 

may be used in (8.1) to give 


F a (y +y' -1, ft ft y, y'; x, y) 

= (i-x-y + 2xy)-Wft i-y, i-y'; — — , —— - - Y 

\ ' i-x-y + 2xy 1-x-y + 2xyj 


Similar is the proof of the quadratic transformations 
H 2 (€ +y “ x, ^5 y } i —y, €; x, y) 


and 


= (1 +j) 1 - € -y(x + ay)*-iHe( i - €, « + y - i, j8; - 


jv(x +jy) a?(x + 2y ) 

(1 + 2y) 2? 1 +y 


F s (a, a 1 - a, a + a'; j/) 


# (# - 1) i 


= (1 -a?) a+a ' _1 (i — 2#)“ a/ H 3 ^a', ft a + a'; 


If, on the other hand, the expansion 


2x) 2>m/ I — 2XJ 


(P)m+nfy(l -X)\™fx(l -y)\» 


ml n\\ x ~t 


(8.2) 


(8.3) 


(8.4) 


(2), § 3 » 


t (1), p. 27, last formula. 


is used in (8.1), the result is * 

F 2(y + y ~ I > A A Y > y'i x > y ) a = F l (y + y '- l , ft , y, y'; x(i — y ), y(l -x)). 


Similar is the proof of 

H 2 (e-S, ft, ft, 8, e; x,y) = (i +xy)~^B.Je-B, ft, 8, e; — — — 

\ i+xy’ i+xy t 




( 8 -S) 


( 8 . 6 ) 


A Biquadratic Transformation 

9. As an example of a rational transformation of higher degree, let us consider the 
transformation of $F_2(\gamma + \gamma' - 1, 2 - \gamma - \gamma', 2 - \gamma - \gamma', \gamma, \gamma'; x, y)$. In the integral representation 
(8.1), with $\beta = 2 - \gamma - \gamma'$, introduce the new variable $u = (1 - y + xy)t/\{1 - x + xy + (x - y)t\}$ to 
obtain 

(2«)-T(y)r(y')r(2 -y- y')( 1 -x + xy)V~ 1 (i - y + xy ) Y ~ 1 

1 -x + xy u* 

'(1 -y+xy ) 2 u -1 


f(-W. - .>-4 +*«— -»(. 


y (i --x+xy)* 

Expanding {. . according to the multinomial theorem, the transformation 

F 2 {y + y'-l, 2 -y-y', 2 — y — y' } y, y'; #, jy) 

= (I -y + *y)y-i (l -* + ^)/-iG 3 (i -y, i *(1 Jd - ( 9 -*> 

immediately follows. 


Applications 

10. A first application of the transformation theory developed in this paper concerns the
relations between hypergeometric functions of two variables. Leaving aside the confluent
functions which are limiting cases of the complete ones, fourteen distinct functions of the
second order are found in Horn's list: those denoted by the letters F, G, H. From the
preceding sections it is seen that eleven of these fourteen functions can be expressed in terms
of Appell's series $F_2$. All functions F, G, H, with the only possible exception of $F_4$, $H_1$, and
$H_5$, can be expressed in terms of $F_2$. With $F_4$ and $H_1$ such an expression is still possible
provided that the parameters satisfy a certain relation. I am inclined to believe, though I have
not succeeded in proving, that there is no simple connection between $F_2$ on the one side
and $F_4$, $H_1$ (with arbitrary parameters), and $H_5$ on the other.

This reduction of all but possibly three functions F, G, H to $F_2$ is of great importance for
the integration of the hypergeometric systems of partial differential equations. Leaving aside
the confluent systems which are limiting forms of complete ones, there are fourteen apparently
distinct types of systems of order two, corresponding to the fourteen series F, G, H. In
spite of much valuable work done, especially by Professor Horn † and Dr Borngässer, ‡ the
integration of these systems still presents considerable difficulties. It is therefore of interest
to know that a complete theory of the system associated with Appell's series $F_2$ would settle the
problem of all but possibly three of them. As to the integration of this latter system, (3.1)
suggests that it could be accomplished by integrals of the type


f (~t)-p(t -l)-p'p- 
'0 

• 0 1 00 1 

x\ 

0 op - 

-P- 

' O I 00 A 

0 0 p rh 


.i-y y-ft-p ft J 


lx_y' y'-ft'-p’ ft J 


where $\rho + \rho' = \alpha + 1$, P is the usual symbol for Riemann's P function, and C is any contour
closed on the Riemann surface of the integrand.


* (4), equation (52).

† (7), (8).

‡ (3).


A closer scrutiny of our results shows that all those of the fourteen systems which have only
three linearly independent integrals can be transformed into the system of $F_1$, and this in its
turn can be integrated (as is well known) by integrals of the type

\-H v \~d 

dt. 


| ( - t)-v{t - i)-r(i -~Y (1 


It is even true that all (also confluent) hypergeometric systems of the second order with only
three linearly independent integrals can be transformed into the system of $F_1$ or limiting cases
thereof: the proof of this theorem is contained in a paper which will appear in the Acta
Mathematica.

11. Another application of the transformation theory consists in using it in connection
with known expansions of $F_2$ and thereby deriving expansions of the other functions. Merely
to give an example, we use Burchnall and Chaundy's expansion *

F 2 (a, ft ft y, r'; y) = £ ( " ^rf{yj^)f yr ^^ a + 2 r, P + r,y + r,y' + r; x, y) 

in conjunction with (8.5) to obtain 

F 4 (y+y' -1, ft y, y'; *(i -y), y(i -x)) 

=2 (~) r(y rKy) r (y't^ r ^ rF4 ( y + y’~ I+2r ’ P +r > y+ r > y ,+r > x >y)- («-i) 

In the same way known reduction formulae for any of the functions can be utilised in order
to derive other such formulae. For instance, from the reduction formula

$$F_4(\alpha, \gamma+\gamma'-\alpha-1, \gamma, \gamma'; \sin^2 u\cos^2 v, \cos^2 u\sin^2 v) = F(\alpha, \gamma+\gamma'-\alpha-1; \gamma; \sin^2 u)\,F(\alpha, \gamma+\gamma'-\alpha-1; \gamma'; \sin^2 v), \qquad (11.2)$$

due to Bailey and Watson, and (7.6),

rr ( r> a o s in 2 U COS 2 V 2 COS U $in V \ 

H 4 (y + j8-l, jS, y, 28; -7-:-rr,---7—) 

\ r r 4(1 +cos u sin v ) 2 1 + cos u sm vj 

=(x+cos«sm^ + ' s - 1 F^^i^,'^; y; sin 2 1 ; /3+£; sin 2 A (xx. 3 ) 


and from (11.2) and (7.4), 


FiljB+J^-iAjr, 2j8, 2j3' ; 


2 sm a cos ^ 2smz/ cos u 


1 + sin (& + v)’ 1 + sin (u + v) 

={x+sin (u + v)Y+r-*F(P±l-l, P±l+L } sin ^ 


<&**) (*m) 


immediately follow. By a quadratic transformation of the Gauss series the right-hand side 
of (11.4) changes into 


/ 1 +sin (u + v) / 

\(i + sin u)(i + sin v) J 


+ P'-~h 2i8; 


2 sm « 

1 +sin« 


v[P+P-hPi 


2 sin v 
1 + sin 


(**•5) 


Similar reduction formulae can be derived from (7.7) and (7.8). 

* (4), equation (37). 





Generalisations 


12. The methods used here apply also to hypergeometric functions of more than two 
variables. A complete factorisation by relations analogous to (3.1) would involve multiple 
integrals, and therefore in some cases it is better only partly to factorise the functions in 
question, grouping their variables in two groups. Thus properties of functions with n variables 
can be deduced from properties of functions with a smaller number of variables, and ultimately, 
by induction, from the properties of hypergeometric functions of one variable. 

As an example we shall consider Lauricella’s series * 


(cL * S- m V * X } ss: V (ft) mi .. . Wn) mn m t r m 

r " * • • » 

Clearly $F_A^{(1)} = F$ and $F_A^{(2)} = F_2$. The relation corresponding to (3.1) is

F T n '(p+ P '-i; A, P'c, y<> y'i> x i ,y j )~W*T( P W)T{2-p-p') 

xj(-/)-^-x)-p'F l(p; A; y<; aW/j A ~)dt, 


(I2.l) 


(I3.2> 


where the contour of integration is a [o; 1] along which 


[ * I > I *11 + • • • + I x n I and | t-i | > \y 1 1 + . . . + \y n -1 . 

From (12.2) a proof by induction of the linear transformations of $F_A$ into itself † immediately
follows. In the same way the expression of $F_B$ in terms of $2^n$ functions $F_A$ is easily obtained.
The transformations analogous to (4.4) and (5.3) are also deducible from (12.2) and the
analogous relation for $F_B$. The place of $H_2$ is taken by the series ‡ $H_{n,p}$. There is a more
general linear transformation representing an H W} p > by a linear combination of series of the 
type with/ >f. 


Assisted in publication by a grant from the Carnegie Trust for the Universities 
of Scotland. 


REFERENCES TO LITERATURE 

(1) Appell, P., and Kampé de Fériet, J., Fonctions hypergéométriques et hypersphériques.
Polynomes d'Hermite, Paris, 1926.

(2) Bailey, W. N., Journ. London Math. Soc., XIII (1938), 8-12.

(3) Borngässer, L., Dissertation, Darmstadt, 1933.

(4) Burchnall, J. L., and Chaundy, T. W., Quart. Journ. Maths., XI (1940), 249-270.

(5) Erdélyi, A., Nieuw Arch. voor Wskde (2), XX (1939), 1-34.

(6) Goursat, E., Ann. de l'École Norm. Sup. (2), XVI (1881), Supplement, pp. 3-142.

(7) Horn, J., Math. Ann., CV (1931), 381-407.

(8) Horn, J., Math. Ann., CXI (1935), 638-677; CXIII (1936), 242-291; CXV (1938), 435-455;
Monatshefte für Math. u. Phys., XLVII (1938), 186-194; XLVII (1939), 359-379.


* (1), p. 114, equation (1).

† (1), p. 116, equation (9).

‡ (5), § 5.


(Issued separately September 22, 1948)






XL.—The Linear Difference-differential Equation with Constant Coefficients. 
By E. M. Wright, University of Aberdeen 

(MS. received January 4, 1947. Read May 5, 1947) 

Introduction 

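1. The general homogeneous difference-differential equation with constant coefficients is

$$\Lambda\{y(x)\} \equiv \sum_{\mu=0}^{m}\sum_{\nu=0}^{n} a_{\mu\nu}\,y^{(\nu)}(x+b_\mu) = 0, \qquad (1.1)$$

where x is a real variable, $y^{(0)}(x) = y(x)$, $0 = b_0 < b_1 < \cdots < b_m$, and the $a_{\mu\nu}$ are any (real
or complex) numbers independent of x. We suppose that $m \ge 1$, $n \ge 1$, and that each of
the sets

$$a_{0\nu}\ (0 \le \nu \le n), \quad a_{m\nu}\ (0 \le \nu \le n), \quad a_{\mu 0}\ (0 \le \mu \le m), \quad a_{\mu n}\ (0 \le \mu \le m)$$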

contains at least one non-zero member. These conditions exclude pure differential and 
pure difference equations (whose theory is sufficiently well known) but do not otherwise 
restrict generality. 

As "boundary conditions" we suppose assigned the values of $y^{(\nu)}(0)$ for $0 \le \nu < n$ and
of $y^{(n)}(x)$ for $0 \le x \le b_m$, $y^{(n)}(x)$ being integrable (in the Lebesgue sense) in this interval.
It is convenient to regard (1.1) as an equation in the unknown function $y^{(n)}(x)$ and to define
$y^{(\nu)}(x)$ ($\nu \le n-1$) by

$$y^{(\nu)}(x) = y^{(\nu)}(0) + \int_0^x y^{(\nu+1)}(u)\,du.$$

Thus (1.1) is essentially a difference-integral equation.

The equation (1.1) is satisfied by

$$y(x) = \sum_r P_r(x)\,e^{s_r x}, \qquad (1.2)$$

where $s_r$ is any zero of order $l+1 \ge 1$ of the associated transcendental equation

$$\tau(s) = \sum_{\mu}\sum_{\nu} a_{\mu\nu}\,s^{\nu}e^{b_\mu s} = 0, \qquad (1.3)$$

$P_r$ is any polynomial in x of degree not greater than l, and the sum in (1.2) is finite or infinite
with suitable convergence conditions. Our main object is to show that, under suitable
conditions, (1.2) is the most general solution of (1.1) and to determine the coefficients in
the $P_r$ in terms of the boundary conditions. The latter problem is analogous to that of
determining the "arbitrary constants" in the solution of a differential equation in terms of
the boundary conditions.
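As an illustration of (1.2) and (1.3), the sketch below takes the simplest admissible case m = n = 1 with $b_0 = 0$, $b_1 = 1$, $a_{00} = a_{11} = 1$ and all other coefficients zero, so that (1.1) reads $y'(x+1) + y(x) = 0$ and $\tau(s) = 1 + se^{s}$. The example equation, the starting point and the use of Newton's iteration are assumptions made for illustration; they are not taken from the paper.

```python
import cmath


def tau(s):
    """Characteristic function of the illustrative equation y'(x+1) + y(x) = 0."""
    return 1 + s * cmath.exp(s)


def dtau(s):
    """Derivative of tau with respect to s."""
    return (s + 1) * cmath.exp(s)


def newton(s, steps=50):
    """Plain Newton iteration for a zero of tau, started from the guess s."""
    for _ in range(steps):
        s = s - tau(s) / dtau(s)
    return s


# Starting guess chosen near the zero of smallest modulus in the upper half-plane.
root = newton(complex(-0.3, 1.3))

# y(x) = exp(root * x) should then satisfy y'(x+1) + y(x) = 0 at any x.
x0 = 0.7
residual = root * cmath.exp(root * (x0 + 1)) + cmath.exp(root * x0)
print(root, abs(tau(root)), abs(residual))
```

Each simple zero found this way contributes one term $P_r e^{s_r x}$, with $P_r$ a constant, to the series (1.2).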

If $y_1(x)$ is the general solution of (1.1) and $y_2(x)$ any particular solution of

$$\Lambda\{y(x)\} = v(x), \qquad (1.4)$$

where v(x) is a known function, the general solution of (1.4) is $y_1(x) + y_2(x)$. I show how to
find a particular solution of (1.4) in Wright (1948 a), which deals with a more general
equation.

In what follows, $\mu$, $\nu$, $\rho$ are whole numbers satisfying $0 \le \mu \le m$, $0 \le \nu \le n$, $0 \le \rho \le n$.
Any statement involving $\mu$, $\nu$ or $\rho$ holds for all these values of $\mu$, $\nu$ or $\rho$ unless the contrary
is expressly stated; $\sum_\mu$, $\sum_\nu$, $\sum_\rho$ denote summation over these values of $\mu$, $\nu$, $\rho$. The
numbers x and T are real, X is real and positive, $s = \sigma + it$ is complex, and M is a positive
whole number. The number C, not always the same at each occurrence, is a positive number
independent of x, s, T and M but possibly depending on any other relevant parameters.
The numbers $C_1, C_2, \ldots$ are of the type C, but $C_1$, for example, always retains the same
value at each occurrence. The O( ) and o( ) notations refer to the passage of x, |t|,
T or M to $\infty$, as may be stated; the constant implied is of the type C.

It follows from Lemma 2 (i) in the sequel that we can arrange the zeroes $s_r$ of $\tau(s)$ in a
sequence such that $|I(s_r)| \le |I(s_{r+1})|$ and that there are numbers $C_1$, $C_2$ and a sequence
$\{T_M\}$ satisfying

$$|T_M - MC_1| < C \qquad (1.5)$$

and

$$|I(s_{R(M)})| + C_2 < T_M < |I(s_{R(M)+1})| - C_2 \qquad (1.6)$$

for some R(M). We shall also see that R(M)/M tends to a finite limit as $M \to \infty$.

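We write

$$H_1(s) = \sum_{\mu=1}^{m}\sum_{\nu} a_{\mu\nu}\,e^{b_\mu s}\int_0^{b_\mu} y^{(\nu)}(u)e^{-su}\,du, \qquad (1.7)$$

$$H(s) = H_1(s) + \sum_{\mu}\sum_{\nu=1}^{n}\sum_{\lambda=0}^{\nu-1} a_{\mu\nu}\,e^{b_\mu s}s^{\nu-\lambda-1}y^{(\lambda)}(0), \qquad (1.8)$$

and $e^{s_r x}P(r, \nu, x)$ for the residue of $s^{\nu}e^{sx}H(s)/\tau(s)$ at the pole $s = s_r$, so that

$$P(r, \nu, x) = e^{-s_r x}\,\operatorname*{Res}_{s=s_r}\frac{s^{\nu}e^{sx}H(s)}{\tau(s)}$$

is a polynomial of degree l (at most) in x if $s_r$ is a zero of $\tau(s)$ of order l + 1. Finally we
write

$$S(M, \nu, x) = \sum_{r=1}^{R(M)} P(r, \nu, x)\,e^{s_r x}.$$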

We shall prove 

Theorem 1. If $a_{mn} \ne 0$, b is the least value of $b_\mu$ such that $a_{\mu n} \ne 0$ and (1.1) is satisfied
for x > 0, then

$$y^{(\nu)}(x) = \lim_{M\to\infty} S(M, \nu, x) \qquad (1.9)$$

for x > b and $\nu \le n-1$. The convergence is uniform in any finite interval $b + \delta \le x \le C$ ($\delta > 0$).
If $y^{(n)}(x)$ is continuous and of bounded variation for $0 \le x \le b_m$, then so it is for all x > 0
and (1.9) holds for x > b and $\nu = n$.

A trivial change of variable enables us to deduce

Theorem 2. If $a_{0n} \ne 0$, b' is the greatest value of $b_\mu$ such that $a_{\mu n} \ne 0$ and (1.1) is satisfied
for x < 0, then (1.9) is true for x < b' and $\nu \le n-1$. The convergence is uniform in any
finite interval $-C \le x \le b' - \delta$ ($\delta > 0$). If $y^{(n)}(x)$ is continuous and of bounded variation
for $0 \le x \le b_m$, then so it is for all x < 0 and (1.9) holds for x < b' and $\nu = n$.

From these two theorems we have

Theorem 3. If $a_{0n} \ne 0$, $a_{mn} \ne 0$ and (1.1) is satisfied for all x, then (1.9) is true for all
x and $\nu \le n-1$. The convergence is uniform in any finite interval. If $y^{(n)}(x)$ is continuous
and of bounded variation for $0 \le x \le b_m$, so it is for all x and (1.9) holds for all x and $\nu = n$.


Previous Work 

2. Schmidt (1911), Bochner (1932) and Titchmarsh (1937, 1939) discussed particular 
cases of (1.4), and Pitt (1944) dealt with an integro-differential equation of which (1.1) is a 
special case. Titchmarsh (1937) and Pitt make the hypothesis that $|y^{(\nu)}(x)| < Ce^{C|x|}$ for
all x, while Schmidt and Bochner impose severer restrictions; the object in every case is 
to justify the use of transforms. Titchmarsh (1939) has sketched a means of avoiding this 
assumption, given the necessary differentiability. 



Hilb (1918) used Cauchy's theorem to obtain the expansion (1.9) of y(x) provided (i) that
neither $a_{mn}$ nor $a_{0n}$ is zero and $y^{(\nu)}(x)$ exists for all $\nu \le n$ and all x, or (ii) that one of $a_{mn}$,
$a_{0n}$ does not vanish and that y(x) has continuous derivatives of all orders for all x.

Elsewhere (Wright, 1948 b) I have proved (by comparatively elementary methods) a
result for the equation

$$\sum_{\mu}\sum_{\nu} a_{\mu\nu}(x)\,y^{(\nu)}(x+b_\mu) = v(x)$$

of which the following is a very special case.

Lemma 1. If $a_{mn} \ne 0$, if y(x) is a solution of (1.1) and if $y^{(n)}(x)$ is integrable for
$0 \le x \le b_m$, then $y^{(n)}(x)$ is integrable in any finite interval (0, X), $y^{(\nu)}(x) = O(e^{Cx})$ for all
$\nu \le n-1$ and

$$\int_0^x |y^{(n)}(\xi)|\,d\xi = O(e^{Cx})$$

as $x \to \infty$. If, in addition, $y^{(n)}(x)$ is continuous and of bounded variation throughout the
interval $(0, b_m)$, it is both throughout any finite interval (0, X).

This lemma and the corresponding result for negative x when $a_{0n} \ne 0$ enable us to deduce
the behaviour of $y^{(\nu)}(x)$ at infinity from that of $y^{(n)}(x)$ in the interval $0 \le x \le b_m$, i.e. from
the boundary conditions. Professor Pitt, to whom I communicated my results as soon as
my attention was drawn to his paper, was kind enough to let me see the manuscript of a
sequel (Pitt, 1947). In this he proves a theorem effectively equivalent to Lemma 1. When
we combine this with the result of Pitt (1944) my Theorem 3 follows. Theorems 1 and 2,
however, do not follow from his results, and I prove them here by a method differing in
important particulars from his. These differences are necessary to cover the case in which
one of $a_{mn}$, $a_{0n}$ is zero, since, as we shall see in the next section, the behaviour of $\tau(s)$ and
the distribution of its zeroes are much more complicated in this case than when neither of
$a_{mn}$, $a_{0n}$ is zero. The former case seems to be important for applications (see, for example,
Callender, etc., 1936; Hartree, etc., 1937; Sievert, 1941; van der Werff, 1942) and so
worthy of separate attention.

Properties of $\tau(s)$ and the Location of its Zeroes

3. We write

$$\beta(s) = \max_{\mu,\nu} |a_{\mu\nu}\,s^{\nu}e^{b_\mu s}|$$

and require the following

Lemma 2. (i) The number of zeroes of $\tau(s)$ for which $|t - T| < C$ is bounded for all T.

(ii) For any $C_3$ and suitable $C_4$, $\tau(s)$ has no zeroes in the region

$$|\arg(-s)| \le \tfrac{1}{2}\pi - C_3, \quad |s| > C_4.$$

(iii) If $a_{mn} \ne 0$, all the zeroes of $\tau(s)$ are to the left of the line $\sigma = C_5$ for some $C_5$.

(iv) If we surround every zero of $\tau(s)$ by a circle of radius any $C_6$ we have $|\tau(s)| > C\beta(s)$
for all s outside these circles.

Of these, (i), (ii) and (iv) follow at once from Langer (1929, p. 844),* while (iii) is trivial.

Although Lemma 2 is all we need for the proof of Theorem 1, it will make matters clearer 
if we describe the location of the zeroes in more detail. We omit proofs, since our statements 
follow from Langer’s results. 

If $a_{0n} \ne 0$ and $a_{mn} \ne 0$, the zeroes of $\tau(s)$ are confined to a strip $\sigma_1 < \sigma < \sigma_2$ and approach
asymptotically the zeroes of

$$\sum_{\mu=0}^{m} a_{\mu n}\,e^{b_\mu s} \qquad (3.1)$$

for large $|t|$. The number of zeroes of $\tau(s)$ for which $T \le t \le T + C_7$ lies between

$$\frac{C_7 b_m}{2\pi} \pm m$$

for large enough $|T|$. Langer's Theorem B gives ±(m + 1) in place of ±m in this result,
but Wilder (1917), from whom Langer quotes the theorem, proves that ±m is sufficient.

* Langer states that $\tau(s)$ is uniformly bounded from zero if s is uniformly distant from the zeroes of $\tau(s)$.
This is not quite correct; what is correct and what follows at once from his argument is that the ratio of $\tau(s)$
to its term of maximum modulus is uniformly bounded from zero when s is restricted as stated.

Now let us suppose that, for example, $a_{0n} = 0$. We plot the points $(b_\mu, \nu)$ for every non-zero
term of $\tau(s)$. The result is an incomplete array of points arranged in columns and rows
for which $0 \le b_\mu \le b_m$, $0 \le \nu \le n$. Since $a_{0n} = 0$, the left-hand top corner point is certainly
missing.

Through $P_0(b_0, \nu_0)$, the point at the top of the left-hand column, we draw the line $L_1$ with
least slope which has no plotted point above it. $L_1$ must pass through at least one plotted
point other than $P_0$. Through $P_1(b_{\mu_1}, \nu_1)$, the point on $L_1$ furthest to the right, we draw
the line $L_2$ with least slope which has no plotted point above it, and so on. For some $J \ge 1$,
the point $P_J$ will have $\nu_J = n$ and the process ends. Let $d_j$ be the slope of $L_j$ and $(b_{\mu_j}, \nu_j)$
the co-ordinates of $P_j$. We observe that

$$\nu - d_j b_\mu$$

is the same (say $c_j$) for all points on $L_j$ and less than $c_j$ (by a finite amount) for every other
plotted point.
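The construction of the lines $L_1, L_2, \ldots$ can be sketched in code. This is a minimal sketch, assuming the plotted points are supplied as pairs $(b_\mu, \nu)$ and that, as in the text, a point with $\nu = n$ lies to the right of the left-hand column; the function name `newton_diagram` and the sample points are hypothetical. The line of least slope through the current vertex with no plotted point above it is taken to be the one whose slope is the largest slope from that vertex to any point on its right.

```python
from fractions import Fraction


def newton_diagram(points, n):
    """Return the vertices P_0, ..., P_J and the slopes d_1, ..., d_J.

    points: iterable of (b, nu) pairs, one for each non-zero term of tau(s).
    n:      the order of the highest derivative occurring.
    """
    pts = sorted(set((Fraction(b), nu) for b, nu in points))
    b_left = pts[0][0]
    # P_0 is the top point of the left-hand column.
    current = max(p for p in pts if p[0] == b_left)
    vertices, slopes = [current], []
    while current[1] != n:
        right = [p for p in pts if p[0] > current[0]]
        # Least slope with no point above = largest slope to a point on the right.
        slope = max((p[1] - current[1]) / (p[0] - current[0]) for p in right)
        on_line = [p for p in right
                   if p[1] - current[1] == slope * (p[0] - current[0])]
        current = max(on_line)  # the point on the line furthest to the right
        vertices.append(current)
        slopes.append(slope)
    return vertices, slopes


# Hypothetical array of non-zero terms with n = 2; the corner point (0, 2) is missing.
vertices, slopes = newton_diagram([(0, 0), (1, 1), (2, 1), (3, 2)], n=2)
print(vertices, slopes)
```

Exact rational arithmetic is used so that the test for a point lying on a line is not disturbed by rounding.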

The zeroes of $\tau(s)$ to the left of $\sigma = -C$, for large enough C, are confined to strips $S_j$
of finite width enclosing the curves

$$\sigma = -d_j \log|t|,$$

or, what is asymptotically the same,

$$R(s + d_j \log s) = 0.$$

In each such strip the zeroes approach asymptotically those of

$$\phi_j(s) = \sum a_{\mu\nu}\,s^{\nu}e^{b_\mu s} = s^{c_j}\sum a_{\mu\nu}\{s^{d_j}e^{s}\}^{b_\mu},$$

the summation extending over every $\mu$, $\nu$ such that the corresponding plotted point lies on
$L_j$. If there are only two such points the zeroes of $\phi_j(s)$ are asymptotically determinate.

For any C? and large enough | T |, the number of zeroes in the strip Sj with T <t < T+C 7 
lies between 

277 


where h$ is the number of terms in cf>j(s). 

For suitable C the zeroes of r(s) for which | a | < C approach asymptotically those of 

m 

^(4=2 v v= X 

where b, b' are the numbers of Theorems 1 and 2. If $b' \ne b$, the number of these zeroes with
t in the interval $(T, T + C_7)$ lies between

$$\frac{(b'-b)C_7}{2\pi} \pm h$$

for large enough $|T|$, where h is the number of non-zero terms in $\phi(s)$. If $b' = b$, $\phi(s)$ contains
only one term and $\tau(s)$ has only a finite number of zeroes in any strip $\sigma_1 < \sigma < \sigma_2$.

If $a_{mn} \ne 0$, then $b' = b_m$ and $\tau(s)$ has no zeroes to the right of $\sigma = C$ for suitable C. If
$a_{mn} = 0$, we can use the same diagram as before and determine strips containing the
remaining zeroes in the obvious way.


Proof of Theorem 1

4. We now suppose the conditions of Theorem 1 satisfied. Then, by Lemma 1,
$y^{(\nu)}(x) = O(e^{C_8 x})$ for $\nu \le n-1$ and $\int_0^x |y^{(n)}(\xi)|\,d\xi = O(e^{C_8 x})$ as $x \to \infty$. Hence

$$Y_\nu(s) = \int_0^\infty \{y^{(\nu)}(x) - y^{(\nu)}(0)\}e^{-sx}\,dx$$

is a regular function of s for $\sigma > C_8$. We have

$$Y_\nu(s) = sY_{\nu-1}(s) - s^{-1}y^{(\nu)}(0)$$

by integration by parts. Applying this $\nu$ times in succession we obtain

$$Y_\nu(s) = s^{\nu}Y_0(s) - \sum_{\lambda=1}^{\nu} s^{\nu-\lambda-1}y^{(\lambda)}(0). \qquad (4.1)$$

If we multiply (1.1) by $e^{-sx}$, integrate with respect to x from 0 to $\infty$ and use (4.1), we have

$$\tau(s)Y_0(s) = H(s) - \frac{\tau(s)\,y(0)}{s}$$

and

$$\frac{s^{\nu}H(s)}{\tau(s)} = Y_\nu(s) + \sum_{\lambda=0}^{\nu} s^{\nu-\lambda-1}y^{(\lambda)}(0). \qquad (4.2)$$

This provides the analytic continuation of $Y_\nu(s)$ for all values of s except the origin and
the zeroes of $\tau(s)$ which are, in general, poles of $Y_\nu(s)$.

We now take the sequence $\{T_M\}$ of (1.5) and (1.6); all our statements contain the implied
condition "for M greater than a suitable C". By Lemma 2 (iv) and (1.6),

$$|\tau(\sigma \pm iT_M)| > C\beta(\sigma \pm iT_M)$$

for all $\sigma$. We choose $C_9 > \max(C_5, C_8)$ and consider the contour $\Gamma(M)$ formed of the
four lines

$\Gamma_1(M)$ ($\sigma = C_9$, $-T_M \le t \le T_M$),

$\Gamma_2(M)$ ($t = T_M$, $-T_M \le \sigma \le C_9$),

$\Gamma_3(M)$ ($\sigma = -T_M$, $-T_M \le t \le T_M$),

$\Gamma_4(M)$ ($t = -T_M$, $-T_M \le \sigma \le C_9$).

By (4.2) and Cauchy's Theorem,

$$2\pi i\,S(M, \nu, x) = \int_{\Gamma(M)} \frac{s^{\nu}e^{sx}H(s)}{\tau(s)}\,ds = \int_{\Gamma(M)} Y_\nu(s)e^{sx}\,ds + 2\pi i\,y^{(\nu)}(0). \qquad (4.3)$$

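By (4.2) and the definition of $H(s)$ we have

$$\tau(s)Y_\nu(s) - s^{\nu}H_1(s) = s^{\nu}\{H(s) - H_1(s)\} - \tau(s)\sum_{\lambda=0}^{\nu} s^{\nu-\lambda-1}y^{(\lambda)}(0)$$

$$= \sum_{\mu}\Big(\sum_{\rho=1}^{n}\sum_{\lambda=0}^{\rho-1} - \sum_{\rho=0}^{n}\sum_{\lambda=0}^{\nu}\Big) a_{\mu\rho}\,e^{b_\mu s}s^{\nu+\rho-\lambda-1}y^{(\lambda)}(0).$$

Hence, for $\sigma \le C$ and $|s| > C$,

$$\tau(s)Y_n(s) - s^{n}H_1(s) = -\sum_{\mu=0}^{m}\sum_{\rho=0}^{n}\sum_{\lambda=\rho}^{n} a_{\mu\rho}\,e^{b_\mu s}s^{n+\rho-\lambda-1}y^{(\lambda)}(0) = O(s^{n-1}),$$

and, if $\nu \le n-1$,

$$\tau(s)Y_\nu(s) - s^{\nu}H_1(s) = \sum_{\mu=0}^{m}\Big(\sum_{\rho=\nu+2}^{n}\sum_{\lambda=\nu+1}^{\rho-1} - \sum_{\rho=0}^{\nu}\sum_{\lambda=\rho}^{\nu}\Big) a_{\mu\rho}\,e^{b_\mu s}s^{\nu+\rho-\lambda-1}y^{(\lambda)}(0) = O(s^{n-2}) \qquad (4.4)$$

as $|t| \to \infty$, where $\sum_{\rho=n+1}^{n}$ denotes an empty sum.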

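By an obvious change of variable

$$e^{b_\mu s}\int_0^{b_\mu} y^{(\nu)}(u)e^{-su}\,du = \int_0^{b_\mu} y^{(\nu)}(b_\mu - v)e^{sv}\,dv$$

and

$$H_1(s) = \sum_{\mu=1}^{m}\sum_{\nu=0}^{n} a_{\mu\nu}\int_0^{b_\mu} y^{(\nu)}(b_\mu - v)e^{sv}\,dv.$$

Hence, for $\sigma \le C$, $|H_1(s)| < C$ and $H_1(s) \to 0$ uniformly in $\sigma$ as $|t| \to \infty$. Also, by
Titchmarsh (1937, p. 70, Theorem 49),

$$\int_{-\infty}^{\infty} |H_1(C_9 + it)|^2\,dt \qquad (4.5)$$

converges.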

We now suppose that $b + \delta \le x \le C$. On $\Gamma_3(M)$, $|\tau(s)| > C\beta(s)$ by Lemma 2.
But there is at least one non-zero term with $\mu = 0$ in $\tau(s)$ and so $\beta(s) > C$ for $|s| > C$. Hence,
on $\Gamma_3(M)$,


and so 


uniformly in x as co. 
On F Z (M) we have 



+ c\ 

s n~ 1 

r(s) 

1 

r(s) 


<C\s»\<CM\ 


)r s (M) 


Y v (s)e 8X ds 


< CM n+ 1 e- om -*o 


(4*6) 


| t(s) I > Cj 3 (s) > C | s n e bs |, 


Y v (s)e* 


Jr 


s r ^ n ff l (s) 

r( s ) 

+ c 

$n-lgb 8 

r(s) 

Hence 



Y v ($)e 8X ds 

1 

< 0(1) f 0 
J -JFjf 


<q |5iWI + ' 


(4.7)

uniformly in x as $M \to \infty$. Similarly for $\Gamma_4(M)$.
Combining (4.3), (4.6) and (4.7) we see that

$$2\pi i\,S(M, \nu, x) - \int_{\Gamma_1(M)} Y_\nu(s)e^{sx}\,ds \to 2\pi i\,y^{(\nu)}(0) \qquad (4.8)$$

uniformly in x as $M \to \infty$. By a well-known result in the theory of the Laplace transform
(see, for example, Widder, 1941, p. 66, Theorem 7.3)

$$\int_{\Gamma_1(M)} Y_\nu(s)e^{sx}\,ds \to 2\pi i\{y^{(\nu)}(x) - y^{(\nu)}(0)\} \qquad (4.9)$$

as $M \to \infty$, provided $y^{(\nu)}(x)$ is continuous and of bounded variation in the neighbourhood
of x. For $\nu \le n-1$, this is always true since $y^{(\nu)}(x)$ is an integral; for $\nu = n$, it is true by
Lemma 1 if $y^{(n)}(x)$ is continuous and of bounded variation for $0 \le x \le b_m$. Hence the first
and last parts of Theorem 1 follow from (4.8) and (4.9).



To prove the second part of Theorem 1 it only remains to show that the convergence
in (4.9) is uniform in x for $\nu \le n-1$. For this it is enough to show that

$$\int_{-\infty}^{\infty} |Y_\nu(C_9 + it)|\,dt \qquad (4.10)$$

converges. By (4.4),

$$|Y_\nu(C_9 + it)| < C\,\frac{|H_1(C_9 + it)|}{|C_9 + it|} + \frac{C}{C_9^2 + t^2}$$

and

$$\int_{-\infty}^{\infty} \frac{|H_1(C_9 + it)|}{|C_9 + it|}\,dt \le \Big\{\int_{-\infty}^{\infty} |H_1(C_9 + it)|^2\,dt \int_{-\infty}^{\infty} \frac{dt}{C_9^2 + t^2}\Big\}^{1/2} < C$$

by (4.5). Hence (4.10) converges.

Summary 

Under the condition that one at least of the leading coefficients $a_{mn}$, $a_{0n}$ differs from zero,
the equation

$$\sum_{\mu=0}^{m}\sum_{\nu=0}^{n} a_{\mu\nu}\,y^{(\nu)}(x+b_\mu) = 0$$

has as solution a series convergent for all x greater (or all x less) than a fixed number. The
coefficients of the various terms in the series are expressed in terms of the arbitrary values
of the solution and its first n derivatives in an initial interval of appropriate length.

This paper was assisted in publication by a grant from the Carnegie Trust for the 
Universities of Scotland. 


REFERENCES TO LITERATURE 

Bochner, S., 1932. Vorlesungen ueber Fouriersche Integrale, Leipzig.

Callender, A., Hartree, D. R., and Porter, A., 1936. "Time-lag in a control system", Phil.
Trans. Roy. Soc. London, A, CCXXXV, 415-444.

Hartree, D. R., Porter, A., Callender, A., and Stevenson, A. B., 1937. "Time-lag in a
control system. II", Proc. Roy. Soc. London, A, CLXI, 460-476.

Hilb, E., 1918. "Zur Theorie der linearen funktionalen Differentialgleichungen", Math. Ann.,
LXXVIII, 137-170.

Langer, R. E., 1929. "The asymptotic location of the roots of a certain transcendental equation",
Trans. Amer. Math. Soc., XXXI, 837-844.

Pitt, H. R., 1944. "On a class of integro-differential equations", Proc. Camb. Phil. Soc.,
XL, 199-211.

-, 1947. "On a class of linear integro-differential equations", Proc. Camb. Phil. Soc., XLIII,
153-163.

Schmidt, E., 1911. "Ueber eine Klasse linearer funktionaler Differentialgleichungen", Math. Ann.,
LXX, 499-524.

Sievert, R. M., 1941. "Zur theoretisch-mathematischen Behandlung des Problems der
biologischen Strahlenwirkung", Acta Radiologica, XXII, 237-251.

Titchmarsh, E. C., 1937. Theory of Fourier Integrals, Oxford.

-, 1939. "Solutions of some functional equations", Journ. London Math. Soc., XIV, 118-124.

van der Werff, J. Th., 1942. "Die mathematische Theorie der biologischen
Reaktionserscheinungen, besonders nach Roentgenbestrahlung", Acta Radiologica, XXIII, 603-621.

Widder, D. V., 1941. The Laplace Transform, Princeton.

Wilder, C. E., 1917. "Expansion problems of ordinary linear differential equations, etc.", Trans.
Amer. Math. Soc., XVIII, 415-442.

Wright, E. M., 1948 a. "Linear difference-differential equations", Proc. Camb. Phil. Soc.,
XLIV, 179-185.

-, 1948 b. "The linear difference-differential equation with asymptotically constant coefficients",
Amer. Journ. Math., LXX, 221-238.

( Issued separately May 20, 1949) 




XLI.— Problems in Factor Analysis * By D. N. Lawley, M.A., D.Sc. 

(MS. received February 3, 1947. Read June 2, 1947) 

1. In cases where a large sample has been drawn from a multivariate population it is possible
to apply tests of the hypotheses made by factor analysts regarding the number of common
factors on which the variables depend (Lawley, 1940, 1941). There still remain, however,
certain points which have not yet been discussed, and it is the purpose of the present paper
to deal with these. For convenience it would seem desirable to summarise briefly results
already obtained.

We suppose that there are n variables, denoted by $x_i$ (i = 1, 2, . . . n), which obey a
multivariate normal distribution; and, without loss of generality, we may assume that all
means are zero. We shall denote by x the column vector of which $x_i$ is the typical element.
It is supposed that the $x_i$ depend upon m common factors, represented by the column vector
$f_0$, in addition to the n specific factors, represented by $f_1$. We may then write

$$x = Kf_0 + Tf_1,$$

where K is a matrix having n rows and m columns, and T is an n × n diagonal matrix. The
typical element $\lambda_{ir}$ of K represents the "loading" of $x_i$ in the rth common factor; while
$\tau_i$, the typical element of T, represents the specific loading of $x_i$.

Since the factors are assumed to be distributed independently with unit variances, the
matrix C whose elements are the variances and covariances $\{c_{ij}\}$ of the $x_i$ satisfies the equation

$$C = KK' + T^2.$$

It will also be found useful to express the reciprocal of the variance matrix in the form

$$C^{-1} = T^{-2} - T^{-2}K(I + J)^{-1}K'T^{-2}, \qquad (1)$$

where

$$J = K'T^{-2}K.$$

Owing to the fact that any orthogonal transformation of the common factors leaves the
variance matrix C unaltered, we impose the condition that J should be a diagonal matrix;
this condition is sufficient to determine $f_0$ and K uniquely.
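The expression (1) for the reciprocal of C can be checked numerically. The following is a minimal sketch in plain Python, assuming n = 4 variables and m = 2 common factors; the loadings in `K` and `spec` are made-up illustrative numbers, and the helper names are hypothetical. The condition that J be diagonal is not imposed here, since (1) holds for any K.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


def add(A, B, sign=1.0):
    """Return A + sign * B."""
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def inv2(M):
    """Inverse of a 2 x 2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


# Illustrative (assumed) loadings: K is n x m, spec holds the diagonal of T.
K = [[0.8, 0.3], [0.7, -0.2], [0.6, 0.4], [0.5, -0.5]]
spec = [0.5, 0.6, 0.7, 0.4]

T2 = diag([t * t for t in spec])
T2inv = diag([1.0 / (t * t) for t in spec])
Kt = transpose(K)

C = add(matmul(K, Kt), T2)                       # C = KK' + T^2
J = matmul(matmul(Kt, T2inv), K)                 # J = K' T^-2 K
IJ_inv = inv2(add(diag([1.0, 1.0]), J))          # (I + J)^-1
correction = matmul(matmul(matmul(matmul(T2inv, K), IJ_inv), Kt), T2inv)
C_inv = add(T2inv, correction, sign=-1.0)        # right-hand side of (1)

product = matmul(C, C_inv)
max_dev = max(abs(product[i][j] - (1.0 if i == j else 0.0))
              for i in range(4) for j in range(4))
print(max_dev)
```

Only an m × m matrix has to be inverted, which is the practical advantage of (1) when m is much smaller than n.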

2. Now suppose that a sample of size N is drawn from the multivariate population.
Then if A is a matrix representing the sample variances and covariances of the $x_i$, the typical
element of A is given by

a ij 2=2 __ — (% “ 

where S denotes summation over the sample and where $\bar{x}_i$ is the sample mean of $x_i$.

The joint distribution of the sample variances and covariances, first found by Wishart 
(1928), takes the form 

L n (da iy ), 

where 

L*\A\V*-*-*>\C\ -i^-^exp | - tr (JO 1 ) 

the notation tr (Z) being used to denote the trace, or sum of diagonal elements, of Z. 

An approximate expression, when N is large, may be obtained for log L by expanding
in powers of the quantities $\{a_{ij} - c_{ij}\}$, these being of order $1/\sqrt{N}$. In doing this it must

* This paper was assisted in publication by a grant from the Carnegie Trust for the Universities of 
Scotland. 




be remembered that $(a_{ij} - c_{ij})$ is identically equal to $(a_{ji} - c_{ji})$. Neglecting terms of order
$1/\sqrt{N}$ or less, we find that

$$\log L = a - \frac{N}{4}\,\mathrm{tr}\{(A - C)C^{-1}(A - C)C^{-1}\}, \qquad (2)$$

where a is a constant.

It will be seen that the above expression represents a quadratic form in the quantities
$\{a_{ij} - c_{ij}\}$, which we are, in effect, assuming to be approximately normally distributed. Their
sampling variances and covariances have been found by Wishart and may be expressed by
the formula (E denoting the expectation or mean value)

$$E\{(a_{ij} - c_{ij})(a_{hk} - c_{hk})\} = \frac{1}{N}(c_{ih}c_{jk} + c_{ik}c_{jh}).$$

In order to estimate the unknown parameters $\lambda_{ir}$ we may apply the maximum likelihood
method and choose our estimates so as to maximise the expression (2) with respect to the
$\lambda_{ir}$. This leads to the equation (quantities of order $1/\sqrt{N}$ once more being neglected)

$$\check{K}'T^{-2}(A - \check{C}) = 0, \qquad (3)$$

where $\check{K}$ is the matrix of estimated loadings $l_{ir}$, and

$$\check{C} = \check{K}\check{K}' + T^2.$$

Equation (3) is the same as that previously obtained when the exact expression for log L
was used. We are assuming, however, that the elements of T, i.e. the specific loadings,
are known.

3. Let U be an orthogonal matrix whose first m columns are those of the matrix

$$U_0 = T^{-1}KJ^{-1/2}.$$

We may then partition U and write

$$U = [\,U_0 \;\vdots\; U_1\,].$$

Now define a matrix D by the equation

$$D = U'T^{-1}(A - C)T^{-1}U,$$

so that the elements of D are linear functions of the quantities $\{a_{ij} - c_{ij}\}$. Then, using (2),
we may write

$$\log L = a - \frac{N}{4}\,\mathrm{tr}\{(TUDU'T)C^{-1}(TUDU'T)C^{-1}\} = a - \frac{N}{4}\,\mathrm{tr}(DB^{-1}DB^{-1}), \qquad (4)$$

where

$$B^{-1} = U'TC^{-1}TU;$$

thus

$$B = U'T^{-1}CT^{-1}U = U'(I + T^{-1}KK'T^{-1})U = U'U + U'U_0JU_0'U = \begin{bmatrix} I_m + J & 0 \\ 0 & I_{n-m} \end{bmatrix},$$

denoting by $I_r$ the r × r unit matrix.

The expression (4) gives us the joint distribution of the elements $\{d_{ij}\}$ of D, which are
clearly uncorrelated, in view of the fact that B is a diagonal matrix. If we let $b_{ii}$ represent
the typical element of B, the variance of $d_{ij}$ can be expressed as

(ba&tt+h /)• 





The equation of estimation (3) may be regarded as imposing a number of linear restrictions
on the elements of $(A - \check{C})$. These restrictions are equivalent to putting

$$\check{d}_{ij} = 0 \qquad (i \le m),$$

where $\check{d}_{ij}$ is the typical element of

$$\check{D} = U'T^{-1}(A - \check{C})T^{-1}U.$$

Now let $\bar{B}$ denote the matrix

$$\begin{bmatrix} 0 & 0 \\ 0 & I_{n-m} \end{bmatrix}$$

formed from B by substituting zeros for the elements of the top left-hand quadrant; and
let $\bar{C}$ be a matrix bearing the same relation to $\bar{B}$ as C does to B. Thus

$$\bar{C} = TU\bar{B}U'T = T(UU' - U_0U_0')T = T^2 - KJ^{-1}K'.$$

Then the variances and covariances of the elements of $(A - \check{C})$ are given by the formula

$$E\{(a_{ij} - \check{c}_{ij})(a_{hk} - \check{c}_{hk})\} = \frac{1}{N}(\bar{c}_{ih}\bar{c}_{jk} + \bar{c}_{ik}\bar{c}_{jh}), \qquad (5)$$

where $\bar{c}_{ij}$ is the typical element of $\bar{C}$.

It has been shown previously that the expression


is distributed approximately as $\chi^2$ with $\tfrac{1}{2}\{(n - m)^2 - n - m\}$ degrees of freedom; we are thus
able to perform a test of significance on the residual covariances taken collectively. We
have now derived expressions for the sampling variances of the residuals $\{a_{ij} - \check{c}_{ij}\}$, considered
separately. It must, however, be realised that the expression for $\bar{C}$ involves the matrix K,
whose elements consist of the unknown population parameters $\lambda_{ir}$. These parameters have
in practice to be replaced by their sample estimates $l_{ir}$. This emphasises the fact that the
above treatment is only appropriate for large samples. Owing to the difficulty of the problem
a treatment suitable for small samples would appear to be virtually impossible.
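The number of degrees of freedom quoted above is easily tabulated. A minimal sketch follows; the function name `residual_degrees_of_freedom` is hypothetical, and it only evaluates $\tfrac{1}{2}\{(n - m)^2 - n - m\}$ for n variables and m fitted common factors.

```python
def residual_degrees_of_freedom(n, m):
    """Evaluate (1/2){(n - m)^2 - n - m} for n variables and m common factors."""
    twice = (n - m) ** 2 - n - m
    if twice < 0:
        # With too many factors for the number of variables no test is available.
        raise ValueError("too many factors for this number of variables")
    # (n - m)^2 and n + m always have the same parity, so the division is exact.
    return twice // 2


for n, m in [(6, 2), (9, 3), (3, 1)]:
    print(n, m, residual_degrees_of_freedom(n, m))
```

For example, six variables with two fitted factors leave four degrees of freedom, while three variables with one factor leave none.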

Another fact which must be taken into account is that we have assumed the specific
loadings $\tau_i$ to be known; or, what comes to the same thing, we have neglected the errors
of estimation of these quantities. The errors could be allowed for, but at the cost of greater
complexity in the formulae. It has already been shown that the estimation of the parameters
is equivalent to the imposition of the further restrictions

$$a_{ii} - \check{c}_{ii} = 0 \qquad (i = 1, 2, \ldots n).$$

It would therefore be necessary to find the partial variances and covariances of the quantities
$\{a_{ij} - \check{c}_{ij}\}$ under the above restrictions. This would involve, amongst other things, calculating
the reciprocal of the n × n matrix representing the variances and covariances of the diagonal
residuals $\{a_{ii} - \check{c}_{ii}\}$. The calculation would, however, scarcely be worth making, especially
since when n, the number of variables, is large the effect of errors in estimation of the
parameters is, in general, small.

To illustrate the use of formula (5) let us consider the case where two factors have been 
fitted. We shall then have 

. . Af, 4 








where 

n-2C^A 

i 

i 

and the typical residual $(a_{ij} - \hat{c}_{ij})$ may be written as

— 4i4 - A'2^3'2 ( 2 ^jO- 

4. Let us now consider the problem of determining the sampling variances and covariances
of the estimated loadings $l_{ir}$. For large $N$ the joint distribution of these quantities
may be obtained by finding

\[
\log (L/L_1),
\]

where $\log L_1$ is the result of replacing $\lambda_{ir}$ by $l_{ir}$, for all $i$ and $r$, in the expression for $\log L$.
Thus, to the usual approximation,

\[
\log (L/L_1) = -\frac{N}{2} \sum \{\alpha_{ir,js}(l_{ir} - \lambda_{ir})(l_{js} - \lambda_{js})\},
\]

where

\[
\alpha_{ir,js} = -\frac{1}{N}\,\frac{\partial^2 (\log L)}{\partial \lambda_{ir}\,\partial \lambda_{js}},
\]

and where the summation is over all possible values of the suffices. Neglecting quantities
of order $1/\sqrt{N}$, the constants $\alpha_{ir,js}$ may be expressed in the form

where, for example, $[K'C^{-1}K]_{rs}$ denotes the element in the $r$th row and $s$th column of the
matrix $K'C^{-1}K$.

It is convenient to use a set of linear functions $m_{ir}$ of the quantities $(l_{ir} - \lambda_{ir})$ given by
the elements of the matrix

U'T~\R-K)J% 

= {T-*RJ-b : T-'UJiR-R)/*. 

We must remember at this point the conditions which are imposed on the loadings $\lambda_{ir}$ and
on their estimates $l_{ir}$ in order to obtain a unique solution. Inspection of the form of $M$
indicates that these conditions are equivalent to putting

\[
m_{ir} = 0 \qquad (i < r).
\]

We may now write

\[
\log (L/L_1) = -\frac{N}{2} \sum_{i \ge r} \sum_{j \ge s} (\beta_{ir,js}\, m_{ir} m_{js}),
\]

where

\[
\begin{aligned}
\beta_{ir,js} &= [J^{-1/2}K'C^{-1}KJ^{-1/2}]_{rs}\,[U'TC^{-1}TU]_{ij} + [J^{-1/2}K'C^{-1}TU]_{is}\,[J^{-1/2}K'C^{-1}TU]_{jr} \\
&= [B^{-1}]_{rs}[B^{-1}]_{ij} + [B^{-1}]_{is}[B^{-1}]_{jr},
\end{aligned}
\]

the matrix $B$ being as previously defined. Since $B$ is a diagonal matrix it is clear that,
for $i \ge r$ and $j \ge s$, $\beta_{ir,js} = 0$ unless $i = j$ and $r = s$. Hence


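\[
\log (L/L_1) = -\frac{N}{2}\left\{\sum_{r} \frac{2m_{rr}^2}{b_{rr}^2} + \sum_{i > r} \frac{m_{ir}^2}{b_{ii}b_{rr}}\right\}.
\]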


The form of the above expression shows that, for large $N$, the quantities $m_{ir}$ are distributed
independently. The variance of $m_{rr}$ is $b_{rr}^2/2N$, while for $i > r$ the variance of $m_{ir}$ is $b_{ii}b_{rr}/N$.
Since the quantities $(l_{ir} - \lambda_{ir})$ are linear functions of the $m_{ir}$, being given by the equation


\[
\hat{K} - K = TUMJ^{-1/2},
\]




it is now a simple matter to find their variances and covariances. If we denote by $G$ the
diagonal matrix whose elements $g_{ii}$ are given by

\[
g_{ii} = b_{ii} \quad (i < r), \qquad g_{ii} = 0 \quad (i > r), \qquad g_{rr} = \tfrac{1}{2}b_{rr},
\]

the variance of $m_{ir}$ for all $i$ and $r$ may be written as

\[
\frac{1}{N}\, b_{rr}(b_{ii} - g_{ii}).
\]

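The covariance between $l_{ir}$ and $l_{jr}$ is therefore

\[
\frac{b_{rr}}{N}\,[J^{-1}]_{rr}\,[TU(B - G)U'T]_{ij}.
\]

If we now put

\[
\gamma_r = [J]_{rr} = \sum_h (\lambda_{hr}^2/\tau_h^2), \qquad
b_{rr} = [I + J]_{rr} = 1 + \gamma_r \quad (r \le m), \qquad
\theta_r = 1 + 1/\gamma_r,
\]

the above expression becomes

\[
\frac{\theta_r}{N}\,[TU(B - G)U'T]_{ij} = \frac{\theta_r}{N}\,[C - TUGU'T]_{ij}.
\]

Hence the covariance between $l_{ir}$ and $l_{jr}$ is

\[
\frac{\theta_r}{N}\Bigl\{c_{ij} - \sum_{s < r} (\theta_s\lambda_{is}\lambda_{js}) - \tfrac{1}{2}\theta_r\lambda_{ir}\lambda_{jr}\Bigr\}. \tag{6}
\]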

The variance of $l_{ir}$ is obtained by putting $i = j$ in the above expression. On the other hand,
when $r \ne s$ the covariance between $l_{ir}$ and $l_{js}$ is zero for all values of $i$ and $j$. This is a consequence
of choosing our factors in such a way that $J$ is a diagonal matrix. The property
will no longer hold if the factors are subsequently rotated, i.e. replaced by an orthogonal
transformation of themselves. Since, however, the rotated loadings will merely be linear
functions of the original ones, their sampling variances and covariances may easily be
determined.
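
The remark about rotation can be sketched numerically. The example below treats the loadings of a single variable on $m = 2$ factors, takes their sampling covariances across different factors to be zero as stated above, and propagates the variances through an orthogonal rotation. The function name, the variances and the rotation angle are all hypothetical values chosen for illustration, not figures from the paper.

```python
import math


def rotated_loading_covariance(variances, q):
    """Covariance matrix of rotated loadings l* = l Q for one variable.

    variances[r] is the sampling variance of the unrotated loading on
    factor r; covariances between different factors are taken as zero.
    q is an m x m orthogonal matrix given as a list of rows.
    """
    m = len(variances)
    return [
        [sum(q[r][a] * variances[r] * q[r][b] for r in range(m)) for b in range(m)]
        for a in range(m)
    ]


# Hypothetical inputs: two factors, a 30-degree rotation.
angle = math.radians(30.0)
c, s = math.cos(angle), math.sin(angle)
q = [[c, -s], [s, c]]
unrotated_variances = [0.04, 0.01]

cov = rotated_loading_covariance(unrotated_variances, q)
for row in cov:
    print([round(value, 6) for value in row])
```

The off-diagonal entries of the result are no longer zero, which is the loss of the zero-covariance property that the text describes, while the total variance (the trace) is unchanged by the rotation.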

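Now the estimated covariance $\hat{c}_{ij}$ may be expressed as

\[
\sum_{r=1}^{m} (l_{ir}l_{jr}),
\]

so that its sampling variance is given by

\[
\sum_{r} \{\lambda_{ir}^2 \operatorname{var}(l_{jr}) + \lambda_{jr}^2 \operatorname{var}(l_{ir}) + 2\lambda_{ir}\lambda_{jr} \operatorname{cov}(l_{ir}, l_{jr})\}.
\]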

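This expression may be evaluated by using the results of (6). We thus find that the variance
of $\hat{c}_{ij}$ is

\[
\begin{aligned}
&\frac{1}{N}\Bigl\{(c_{ii}c_{jj} + c_{ij}^2) - \Bigl(c_{ii} - \sum_r \theta_r\lambda_{ir}^2\Bigr)\Bigl(c_{jj} - \sum_r \theta_r\lambda_{jr}^2\Bigr) - \Bigl(c_{ij} - \sum_r \theta_r\lambda_{ir}\lambda_{jr}\Bigr)^2\Bigr\} \\
&\qquad = \frac{1}{N}\{(c_{ii}c_{jj} + c_{ij}^2) - (\bar{c}_{ii}\bar{c}_{jj} + \bar{c}_{ij}^2)\},
\end{aligned}
\]

where $\bar{c}_{ij}$ is as previously defined. This provides a verification that

\[
E(a_{ij} - c_{ij})^2 = E(a_{ij} - \hat{c}_{ij})^2 + E(\hat{c}_{ij} - c_{ij})^2,
\]

a result which is a consequence of the fact that the sets of quantities $\{a_{ij} - \hat{c}_{ij}\}$ and $\{\hat{c}_{ij}\}$
are distributed independently of each other.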
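
The sampling variance of a fitted covariance, written earlier as a sum over factors of terms in the variances and covariances of the loadings, can be sketched as a short calculation. This is a minimal illustration under stated assumptions: loadings on different factors are taken to be uncorrelated, as the text states, and every numerical value and name below is hypothetical.

```python
def fitted_covariance_variance(lam_i, lam_j, var_i, var_j, cov_ij):
    """Large-sample variance of sum_r l_ir * l_jr.

    lam_i[r], lam_j[r]: loadings of variables i and j on factor r.
    var_i[r], var_j[r]: sampling variances of those loadings.
    cov_ij[r]: sampling covariance between l_ir and l_jr.
    Loadings on different factors are assumed uncorrelated.
    """
    total = 0.0
    for li, lj, vi, vj, cij in zip(lam_i, lam_j, var_i, var_j, cov_ij):
        total += li * li * vj + lj * lj * vi + 2.0 * li * lj * cij
    return total


# Hypothetical two-factor example.
variance = fitted_covariance_variance(
    lam_i=[0.5, 0.3],
    lam_j=[0.4, -0.2],
    var_i=[0.010, 0.020],
    var_j=[0.020, 0.030],
    cov_ij=[0.005, -0.004],
)
print(round(variance, 6))
```

Each factor contributes its own term independently, so the cost grows only linearly with the number of factors.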

Summary 

A set of variables is assumed to depend upon a number of common factors and specifics.
Formulae are then derived for the sampling variances and covariances of the residual covariances
obtained by removing the effect of the factors. The variances and covariances
of the set of estimated loadings are also found. It must, however, be noted that the results
obtained are valid only when an efficient method of estimation is used.


REFERENCES TO LITERATURE 

Lawley, D. N., 1940. “The Estimation of Factor Loadings by the Method of Maximum Likelihood”,
Proc. Roy. Soc. Edin., LX, 64-82.

-, 1941. “Further Investigations in Factor Estimation”, Proc. Roy. Soc. Edin., LXI, 176-185.

Wishart, J., 1928. “The Generalised Product Moment Distribution in Samples from a Normal
Multivariate Population”, Biometrika, A, XX, 32-52.


{Issued separately February 3, 1949) 






XLII.— The Nature of Scientific Philosophy. By Professor Herbert Dingle, 

D.I.C., A.R.C.S., D.Sc. 

(The James Scott Lecture delivered on July 5, 1948) 

(MS. received July 5, 1948) 

The title I have chosen for this lecture contains an implication which perhaps will not be
generally accepted; there was a time when I would not have accepted it myself. The implication
is that a particular kind of philosophy is possible which may be called scientific in contrast
with other kinds which cannot be so called. I would go further and identify this scientific 
philosophy with what is generally called science, and this implies that the distinction that is 
often assumed to exist between science and philosophy is a false one. For this view I believe 
there is historical evidence. Science, as a separate, self-contained study, dates from the seventeenth
century. Before that time, such consideration as was given to the subject-matter of
present-day science was given it by philosophers and regarded as a part of their philosophising, 
and when in the seventeenth century a new kind of procedure was introduced, it was looked 
upon by its pioneers not as an attack on a new problem but as a new attack on an old problem. 
The science of that time was the “new philosophy”, faintly adumbrated by some mediaeval
philosophers, struggling for expression in Francis Bacon, and coming to full recognition in 
Galileo. Only later, when it had made such progress in certain limited fields of study that a 
new body of investigators was called into being who confined themselves to those fields, was 
the new philosophy transformed into a non-philosophy and called generally by the name 
“science”. 

I believe this to have been unfortunate in more ways than one. Not only has it created at 
least a coldness between scientists and philosophers—i.e. between those who limit their studies
to the phenomena readily amenable to treatment by the new philosophy and those interested 
in the no less important phenomena not so readily amenable—but also it has fostered the belief 
that science is by its nature limited to those problems in which it has been immediately 
successful. Thus there has grown up a view of the nature of science—as I shall continue 
for brevity to call what is more properly denominated scientific philosophy—which I believe 
to originate more in the accidents than in the substance of its being. Thus we are told that 
the “scientific method” is applicable only to the world of sensation, or to the measurable, or 
to the material, and so on, and, so far as I have been able to gather, such statements are never 
based on any fundamental consideration of the possibilities inherent in the “scientific method” 
but rather on its actual achievements up to now, just as one might in an earlier time have 
asserted that men would never be able to fly because the principles of navigation were applicable 
only to land and sea travel. 

It seems worth while, therefore, to try to get beneath the surface of modern science and see
if, from an examination of its roots, we can form a more trustworthy idea of its essential
nature and possibilities. To do that I think the first step must be to understand as clearly as 
possible what happened in the seventeenth century, when science as we now know it may be 
said to have begun. I know, of course, that in one sense the beginnings of science lie much 
further back, and I shall not ignore pre-Galilean thought; but when all due respect has been 
paid to that, one outstanding fact stares us in the face. In the 2000 years or more during which 
men had been philosophising before the seventeenth century began, the amount of what we 
now call science which was discovered or created was almost negligible, and continuous 
progress, if any, is hardly discernible. In the 250 years since the seventeenth century ended 
almost the whole of our scientific knowledge has been obtained, and conspicuous progress 
has been not only continuous but continuously accelerated. In that short space of 100 years 
something was introduced which started a movement unknown before, and what I want to do
is to identify that something and delineate it as clearly as possible. I do not say that by itself
it would necessarily, in any circumstances whatever, have given rise to science. The ground 
had to be prepared before it could operate. But I do say that this new element which we are to 
seek is the seed of science, and that it has pre-eminent claims to be regarded as the factor 
which, above all others, made science possible. 

I believe that this factor is to be found in the work of Galileo. There have, of course, been 
other claims. Canon Raven,* for example, has argued cogently for the recognition of a fact of 
history which he thinks has been unduly neglected, namely the recovery, mainly in the seventeenth
century, of the habit of looking at nature without presuppositions, which was possessed
by the Greeks but denied to the thinkers of the Middle Ages through the influence of the 
mythology of the bestiaries and other fanciful legends. He attributes modern science to the 
recovery of this habit, which is seen more prominently in biology than in mechanics, and he 
accordingly considers that the significance of Galileo and his followers has been over-emphasised. 
I do not wish in the least degree to underestimate the importance of the unprejudiced outlook,
but I must point out that no matter how completely it may have been recovered, it is not the 
vital factor which we are seeking; it is at best the soil and not the seed of science. The Greeks 
possessed this outlook, but they did not create science; we do not produce something new 
merely by returning to something old. I cannot find an adequate explanation of the remarkable
contrast between science before and after the seventeenth century in a mere revival,
whether of learning or of anything else. There must be something new, and something 
new, I believe, is to be found in the work of Galileo. 

I want, therefore, first of all to direct your attention to the thought and practice of Galileo, 
but not merely in order to recover his outlook. Our object is to interpret it in the manner 
calculated to give us the best understanding of its nature and potentialities. That will not 
necessarily coincide with Galileo’s own view of what he was doing. No human being, I think, 
could have foretold that on the foundations laid down by his simple experiments with falling 
bodies, a structure would have been reared that in 300 years would have presented the amazing 
façade of modern physics in all its intricacy and comprehensiveness, and we shall hardly
expect Galileo’s view of the new philosophy to be adequate to include all that it has produced 
and may yet produce. He builded better than he knew. From our more advantageous 
viewpoint we should be able to form a truer estimate of his achievement, and in order to do that 
we must see it not only as he saw it—as a new method of answering old problems—but rather 
as a radically new departure in philosophy in which the very problems themselves were changed 
and only the fundamental human impulse to philosophise remained as a common basis for the 
new and the old. 

What is this fundamental human impulse to philosophise? I think it can be described as 
the desire to make sense of our experience, to see it as an ordered rational system instead of a 
succession of fortuitous happenings. As soon as we become conscious we become aware that 
things happen: we see sights, hear noises, feel hot or cold, experience pains and pleasures, 
desires and satisfactions, and a host of other things too numerous to be mentioned. We feel 
also a need to find some order in this chaos, and we are conscious of some innate principle of 
order, if such a phrase may be used, which we call reason. To philosophise is ultimately to 
see all the elements of our experience in rational relations with one another, and all philosophies 
are at bottom attempts to do that as completely as possible. 

Now unless some order is found pretty quickly survival is impossible. If one has no idea 
at all of what is going to happen next—if he does not know that when he feels hungry the 
feeling may be removed by eating; that certain things may be eaten with safety and others 
not—if one knows nothing of all this he will not continue to experience for very long. Even 
the lower animals have achieved this degree of organisation of experience, and so may be said
to be philosophers of a very rudimentary kind. What kind of philosophy directs their actions, 
and leads among other things to the laying of eggs in the right places, to bird migration, and 
so on, we do not know, but we can form a fairly good idea of the philosophy of early man because 
it is not so very different from the philosophy we automatically adopt to-day when we are not 
consciously philosophising at all but merely meeting the ordinary necessities of life. (By 

* Synthetic Philosophy in the Seventeenth Century (Herbert Spencer Lecture for 1945), Basil Blackwell, 
Oxford. 




“early man” I mean here the inhabitants of the most civilised parts of the earth just before 
conscious philosophy began—say one or two thousand years before Christ.) To explain the
things he experienced, early man supposed that there was a world around him consisting of 
numerous bodies which occupied different positions in space and persisted for greater or 
shorter lengths of time. These bodies could move about in space or lie still, and one of them 
was identified by each man in a peculiar manner with himself and called his body: we will call
it the sensitive body to distinguish it. The other objects caused experiences when they 
impinged in some way on the sensitive body. Thus, if they made contact with it there came 
the experience of touch; if they sent certain emanations towards it, or perhaps if the sensitive 
body put out feelers to reach them, there came the experience of sight; and so on. One 
could therefore in large measure predict what he would experience by following the course of 
the objects constituting the world, noticing their continuity of behaviour, and drawing 
deductions on the basis of previous experience. Some experiences, such as certain pains and 
pleasures, hopes and fears, were not readily attributable to the world of material bodies, and 
so were assigned to invisible and intangible spirits which usually shared the time and space 
but not the material character of the world responsible for sensations. This exceedingly rough 
description of our primitive ideas is, of course, not intended to be a precise and complete statement
of the outlook of early man, but only an outline of the sort of very naive realism which
in most of its features was so admirably adapted to the elementary needs of life that we still 
retain it. The description serves its purpose if it indicates to you what you know already, and 
the only point I wish to make here is that, in its limited way, it is a philosophy of a kind, so that 
everyone who has survived the hours of entire dependence on others is a philosopher in embryo. 

What distinguishes the philosopher properly so called from the ordinary man is that he 
desires not only to make such rationalisation of his experience as will meet his elementary 
needs, but to rationalise the whole of his experience. Primitive man needed to know that some 
plants were good to eat and some bad, but he did not need to know why some were green and 
some red, and he felt no grievance against his philosophy because it did not supply this 
superfluous information. Philosophy in the full sense of the word began when all experiences, 
without discrimination, became the object of study. It is often said that the characteristic of 
true philosophising is that it is disinterested, and this of course is true, but I do not think it is 
the best expression of the truth. It is rather the desire to find a place in our system for everything,
whether small or large, useful or useless, past or present, without assigning originally
any degrees of relative importance, that makes a man a true philosopher, and although this 
necessarily directs his interest to things to which he is not attracted for any other reason, that 
is ultimately an accident. A man may devote his life to the study of the social habits of ants, 
with no ulterior motive whatever, but he is not a philosopher so long as he does not grant 
every other phenomenon the same a priori significance; and, on the other hand, a man who, 
after due consideration, finds that the social habits of human beings demand more of his 
thought than those of ants, does not fail to be a philosopher because his material comfort 
promises thereby to be better served. 

Philosophy proper, then, may be said to have begun with the first man who took the whole 
of experience as his field of study and tried to see it as a related set of manifestations of some
common unifying principle. In the Western world that man is generally considered to have 
been Thales of Miletus, who lived some 600 years before Christ, and the fact that his unifying 
principle was something now so unacceptable as water is as nothing beside the immeasurably 
important fact that he caught the first glimpse of the goal which philosophers have ever since 
been striving to reach. I need not recount the various alternative principles which successive 
philosophers substituted for the water of Thales: air, number, eternal flux, and the rest— 
they are familiar enough, and it is not our purpose now to compare their merits. But the point 
which I wish to emphasise as strongly as possible is this: that when men began to improve on
the partial and purely practical rationalisation of experience, which satisfied their ancestors, by 
attempting to create a universal philosophy, they took over that partial rationalisation as though 
it were a necessary, inescapable nucleus, and tried to complete the rationalisation of all 
experience by extending it to realms previously ignored. Thales and his successors right up to 
the seventeenth century took for granted, as though it were directly given them in experience 
itself, the world of material bodies moving in time and space and causing experiences by
impinging on the sensitive body. They overlooked the fact that it was experience they had
to rationalise, and thought they had to explain the material world. 

A single example must suffice. Early man knew that the grey metal, lead, was heavy, 
but he did not concern himself with the idle question why a substance which was heavy should 
also be grey. This was of no practical importance, but it became of great theoretical importance 
when a universal philosophy was in view, and alchemists gave much attention to it. But it 
never occurred to the alchemists or anyone else that they could ignore the objective piece of 
lead in front of them and consider greyness and weight as original elementary experiences. 
To them these phenomena were by nature associated in the given piece of lead, and to understand
lead you had to understand that association. And so the whole of prescientific philosophy
was built on to the practical philosophy of an entirely utilitarian age, and no one considered 
that any other basis was possible. 

There is no time to demonstrate this by tracing the course of philosophy from Thales to 
Galileo, but one or two examples might be given. I have mentioned the alchemists, but the 
naive acceptance of the primitive realism characterised their opponents no less than themselves. 
Thus, Avicenna, in attempting to refute their claims, maintained that although they might 
succeed in changing all the properties of lead into those of gold, the essential nature of the 
substance still remained that of lead. The metaphysical notion that an object, or a type of 
object, had an “essential nature” which existed independently of its manifest properties was 
not questioned by either party, and this entirely supposititious “essential nature” had displaced
experience as the object of the philosopher's investigation. Again, in the great controversy 
between the nominalists and the realists, what was in dispute was the credentials of “universals”
—of lead, for example, but not of a particular piece of lead. That was accepted by both sides 
as an undeniable and indivisible unit of the material world which had to be taken as given. 
And so throughout the whole of ancient and mediaeval philosophy, the problem was to interpret 
the world of material and other objects in space and time. No attempt was made to get 
beneath that world and, if necessary, shatter it to bits and build a world nearer to the mind's 
desire. 

And yet a very simple consideration shows that such a world is very unlikely to lend itself 
to a complete solution of the philosophical problem. No one even now can dispute its efficacy 
—indeed its indispensability—for its purpose of enabling us to carry on the practical business 
of living, but for the complete rationalisation of experience something other than partial 
efficiency is needed. The first requirement for such a purpose is clearly that the basis of one's 
philosophy shall be essentially rational, and in this the practical philosophy fails lamentably. 
For in any rational argument one must proceed from premisses—from evidence—to conclusion, 
and not from conclusion to premisses, and in the practical, commonsense philosophy it is the 
latter course that is taken. Experience is interpreted as a consequence of the action of the 
world of material objects on the sensitive body. But what we know immediately is experience; 
the world of material objects is what we (rightly or wrongly) infer from it, and a scheme which 
attributes the data to the action of an inference from the data thereby declares itself as essentially 
irrational. In the same breath we say there must be a world of matter because we have 
experiences, and we have experiences because there is a world of matter which causes them. 
On such a basis it is not to be expected that great systems of rational thought will be erected, 
and the history of science before the seventeenth century exhibits about as much progress as 
we have a right to expect. 

The undying glory of Galileo’s contribution to thought is that, though only half consciously, 
he discarded this everyday, commonsense world as a philosophical necessity. He paid no 
attention to material objects as such, but analysed the experiences they were supposed to 
induce in us into their elementary constituents and reassembled them differently. Rejecting, 
for example, the supposition that the motion of a body had any direct connection with its 
weight, so that a heavy body necessarily moved differently from a light one, he detached the 
motions of all bodies from the properties associated with them in those bodies and sought for 
laws of motion in itself. To him the primary affinity was not between motion and lightness or 
motion and heaviness, but between motion and motion. The unity of the material object 
was, in effect, denied (or, more exactly, the material object was not inferred as the necessary 
first step in the rationalisation of experience), but instead various examples of a particular kind
of experience—that of motion—were grouped together, represented by appropriately chosen
concepts, and described by means of general laws. 

It is that return to a starting-point more fundamental than that of Thales, or even of the 
semi-savages who preceded him, that I believe to be the new element that made modern science 
possible. Galileo’s treatment of motion was quickly imitated by the treatment of the other 
phenomena of physics—temperature, visibility, electric and magnetic actions, and the rest. 
Each of them formed its own group of phenomena and was studied quite apart from the other 
so-called properties of the bodies which displayed it. Concepts were chosen for each set of 
phenomena regardless of what any other set demanded. The fact that a body was in such and 
such a position in space, for example, was important in the consideration of its motion because 
space was a concept employed in the description of motion, but it had no importance for the 
consideration of the temperature of the body because temperature was described in non-spatial 
terms. It mattered not at all that the body had a unique position in space, for the body as 
such was left out of account. 

Of course, all this was not realised immediately; it is not generally realised even yet. The
new philosophy was practised faithfully by men who still thought in terms of the old. They 
still assumed that the material object was a necessary datum, and regarded themselves as 
performing an act of violation on it by abstracting qualities which really had no right to a 
separate existence. They could only somewhat unconvincingly defend their action against 
philosophers of the traditional order by pointing to its success. It would be of great interest 
to look at the history of modern science, bearing in mind that we are watching the actions of 
men who are working in accordance with one philosophy and viewing their work in terms of 
another. Lack of time, however, makes this impossible, but we cannot leave Galileo without 
taking a glance at his own view of what he had done, for although he did not fully understand 
his achievement he approached as near to a full understanding as was humanly possible to 
any scientist before the present century. 

By good fortune Galileo was on one occasion persuaded to pause in his pioneer labours 
and reflect on the metaphysics which they connoted. This is what he wrote: “I feel myself
impelled by the necessity, as soon as I conceive a piece of matter or corporeal substance, of 
conceiving that in its own nature it is bounded and figured in such and such a figure, that in 
relation to others it is large or small, that it is in this or that place, in this or that time, that 
it is in motion or remains at rest, that it touches or does not touch another body, that it is 
single, few, or many; in short, by no imagination can a body be separated from such conditions; 
but that it must be white or red, bitter or sweet, sounding or mute, of a pleasant or unpleasant 
odour, I do not perceive my mind forced to acknowledge it necessarily accompanied by such
conditions; so if the senses were not the escorts, perhaps the reason or the imagination by 
itself would never have arrived at them. Hence I think that these tastes, odours, colours, etc., 
on the side of the object in which they seem to exist, are nothing else than mere names, but 
hold their residence solely in the sensitive body; so that if the animal were removed, every such 
quality would be abolished and annihilated. Nevertheless, as soon as we have imposed names 
on them, particular and different from those of the other primary and real accidents, we induce 
ourselves to believe that they also exist just as truly and really as the latter.” 

It is clear from this, I think, that Galileo took the existence of material bodies for granted; 
so far he spoke in traditional terms. But it is clear also that, whatever he may have supposed 
them to be in themselves, it was his conception of them that occupied his interest. “As soon
as I conceive a piece of matter”, he begins, and henceforth discusses his conceptions, not some
possibly inconceivable essence that might be supposed to reside in the piece of matter. Furthermore,
when he speaks of the supposed properties of colour, taste, and so on, he implies that they
were arrived at by reason or imagination, not given as initial data, and he indicates further that 
reason or imagination has been led to postulate them in order to account for our sensations; 
it is the senses that are the escorts. All this is quite in keeping with what I have described as 
the new outlook which was necessary to bring modern physics into being. Where I think he
does not quite attain to the view now possible is in the fundamental distinction he sees between 
what we now call the mechanical properties of bodies and their other properties. The former 
he seems to have regarded as having in some way a right to exist independently of experience, 
and to be attainable by reason or imagination without the escort of the senses. In the light of
later experience it is difficult to maintain this distinction. No doubt the fact that he had been
able himself to arrive at far-reaching laws of motion, whereas the other physical phenomena had 
not even begun to be organised, had something to do with the unique status he assigned to 
mechanical qualities, but in truth the magnitude of his achievement was so overwhelming that 
we need not linger on the reasons why it was not even greater. 

We do not find the same degree of penetration in his successors. They indeed started the 
new sciences of heat, optics, electricity and the rest on the authentic lines he had laid down, 
but they lagged far behind him in their understanding of what they were doing. Instead of 
regarding “heat” and “temperature” as conceptions useful for representing experiences of 
warmth and cold (extended later to include thermometer readings), which alone are the 
fundamental data, they put them back into the reinstated material bodies as inherent properties 
of those bodies. Instead of regarding light and the space and time it was supposed to move in 
as conceptions useful for representing experiences of vision, which alone are the fundamental 
data, they gave light the status of a material object and automatically identified the space and 
time of optics with the space and time of mechanics as objectively existing entities which were 
given them for study. They did the same thing with other conceptions, putting them back 
in thought one by one into the supposedly given material world, while all the time unconsciously 
using them as they should be used, as conceptions to be moulded and changed at the dictate of 
experience. Only when the inevitable incompatibilities at last force themselves on our attention 
does it become possible for us to realise the true nature of what has been done. Our “light”, 
which we thought a constituent of the material world, turns out to have contradictory properties. 
But what can we expect if we put into the material world something which by its very origin 
cannot belong there? We postulate material objects and to explain how they reveal themselves
to our senses we postulate light. But something whose basic function it is to reveal
material objects cannot itself be a material object, or we should want another something to 
reveal it, and so on ad infinitum. The confusion arises not from the practice of physicists but
from the illegitimate metaphysical idea that has lurked behind that practice. Again, space 
and time were introduced into mechanics in order to describe motion, and into optics in order 
to describe optical phenomena. It is not necessary that the same space and time that serve 
the one purpose shall serve the other, and, in fact, physicists worked with them quite independently,
in the true scientific manner, while all the time thinking that they were of necessity 
the same objective things and obscuring their differences by filling them with different ethers. 
The test came in 1919, when the deflection of light in the gravitational field of the Sun was 
observed. The result of that test showed that the space and time that met the needs of optics 
were the same space and time, so far as existing knowledge went, as that which met the needs 
of mechanics. The observation was not generally regarded in that light, but that is in fact 
what it amounted to. For suppose the test had been made when Einstein first suggested it 
in 1911. At that time he still thought that Euclidean space would suffice for mechanics, and 
predicted a deflection of half the actual amount. The experiment would then have revealed 
that light suffered twice the calculated deflection, and the most satisfactory explanation so far 
as I can see—supposing anyone had been bold enough to think of it—would have been that 
light travelled in a non-Euclidean space and matter in a Euclidean space. The independence 
of the concepts of time used in the various departments of physics is, I think, still more important, 
but I have dealt with that elsewhere * and a brief discussion of it is impossible. But it is not 
necessary to follow the course of physics since Galileo’s time to realise how completely at 
variance its practice has been with its metaphysical assumptions: we need only look at its 
conclusions. The culmination of 300 years of the ostensible study of material bodies has 
been the production of the equations of the electro-magnetic field, the field equations of 
relativity, the wave equation of the electron, and the laws of thermodynamics. Where in that 
magnificent epitome of knowledge does one find the least indication that it is a world of material 
objects that is being described? 
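
The factor of two in the deflection mentioned above can be made concrete with a short calculation. The sketch below assumes the standard textbook expressions for a ray grazing the Sun, 2GM/(c^2 R) for the 1911 estimate and 4GM/(c^2 R) for the later value; the constants and the function name `deflection_arcsec` are illustrative choices and are not taken from the lecture.

```python
import math

# Reference constants in SI units (assumed values, not from the lecture).
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # mass of the Sun, kg
R_SUN = 6.957e8    # radius of the Sun, m

RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0


def deflection_arcsec(factor, mass=M_SUN, impact=R_SUN):
    """Return factor * G * mass / (c^2 * impact), converted to arcseconds."""
    return factor * G * mass / (C ** 2 * impact) * RAD_TO_ARCSEC


# Factor 2: estimate with Euclidean space (the 1911 prediction).
deflection_1911 = deflection_arcsec(2.0)
# Factor 4: value with the non-Euclidean space of general relativity.
deflection_1915 = deflection_arcsec(4.0)

print(f"1911 estimate: {deflection_1911:.2f} arcsec")
print(f"1915 value:    {deflection_1915:.2f} arcsec")
```

Under these assumptions the second figure is exactly twice the first, which is the sense in which an observation made in 1911 would have shown light suffering twice the calculated deflection.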

* Proc. Aristot. Soc., XLVIII, 1948, 153; Phil. Mag., XXXV, 1944, 499.

The peculiar characteristic of scientific philosophy, then, may be expressed in this way. 
Like all philosophy, its aim is to organise the whole of experience into a rationally connected 
system, but, unlike all previous philosophies, it does not accept the world of material objects, 
located and moving in a unique space and time, as a necessary starting-point, but goes back 
to the original experiences that led to the conception of that world for practical ends, and 
groups them differently. Instead of regarding the greyness and heaviness of lead as indissolubly
associated in the piece of lead, and the yellowness and lightness of sulphur as 
indissolubly associated in the piece of sulphur, it seeks first a relation between greyness and 
yellowness on the one hand and heaviness and lightness on the other, and only when its work 
is wellnigh complete will it arrive at a connection, almost inexpressibly indirect, between the 
colour and density of any of the pieces of matter which to the commonsense view are indivisible 
units. It is an historical fact that this is the procedure which, from the time of Galileo onwards, 
scientific men have practised, though largely unconsciously, and I do not think it can be 
questioned by anyone who compares the science of post-Galilean with that of pre-Galilean 
times, that it is the mainspring of the remarkable impetus which the former reveals. Let us 
now look at one or two consequences which I think have been largely overlooked because of 
our unconsciousness of the true nature of scientific philosophy. 

The first is that, if the view I have advanced is correct, there is no realm of experience 
which is excluded from scientific treatment. When the world of material objects is taken as 
forming the primary data of science, a natural distinction arises between those experiences 
which can be directly traced to that world— i.e. in the main the experiences obtained through 
the five senses—and those others, including religious and aesthetic experiences, which appear 
to have other sources. A sharp difference of opinion has thus arisen, among scientists as well 
as others, as to how these latter experiences are to be regarded. On the one hand there is the 
materialistic view that they are mere “illusions”, ultimately traceable, when we have sufficient 
knowledge, to physiological peculiarities and therefore to be ignored except in so far as the 
advance of medical science may show us how to control them. On the other hand there is the 
“idealistic” view (to choose one of the meanings of an overworked word) that such experiences 
are worth more than mere sensations and come from a world just as “real” as the material 
world, if not more so. Each view claims to be scientific—more, I think, because to be scientific 
is now regarded as an honour than because of any clear conception of what being scientific 
means. 

It is evident, however, that if not the world of material objects but experiences themselves 
are the fundamental data of science, then there is no reason whatever to grant any experience 
initial priority over any other; all are alike submitted for consideration, and our rationalisation 
is incomplete so long as any are excluded. Sensations are not in the least degree more 
“scientific” than emotions. The problem that faces us now is not “How can we account for 
these apparently causeless experiences?” but rather “Why is it that science has made so much 
progress with sensations and so little with emotions?” And the answer, I think, is that it is 
because, in the state of knowledge in the seventeenth century, sensations—or, more precisely, 
those experiences which form the subject-matter of the physical sciences—were the only 
experiences for which the necessary rational machinery existed. 

Consider the problem. You have to represent your experiences by concepts defined in 
such a way that the system of relations which follows by rational deduction from the definitions 
will stand in a one-to-one correspondence with the experiences themselves. In this way 
experience is seen not as a chaos but as an ordered whole. Now in the seventeenth century 
by far the most highly developed system of abstract reasoning that existed was pure mathematics.
There was Aristotelean logic, it is true, but that was quite inadequate to meet the 
need. The chief purpose it served was to protect the thinker from elementary errors; it 
gave him no impulse to create a system of thought. Accordingly the experiences which 
alone at that time could be treated scientifically were those which could be represented
mathematically—that is by number, or magnitude: in other words, the only experiences that could 
be treated scientifically were “measurable” experiences, for measurement is simply an 
operation, precisely defined in all essential details, which yields a number. It was for that 
reason, and that reason alone, that Galileo began with the experience of motion. There are 
other experiences measurable now—e.g. temperature and brightness—but in his day motion 
was almost, if not quite, the only one, and so modern science began with the study of motion. 

There was yet a further limitation. All motions could be measured, but the results of the 
measurements did not always stand in simple relations with one another, and since the ultimate 
aim was to find such relations there was little incentive to waste time on sterile measurements. 
The motions of living creatures, for example, showed no detectable regularity, so they were 
left alone. The motions of falling bodies offered a better prospect of success, but even they, 
in their natural form, gave results difficult to relate simply and exactly with one another, so 
the fall was controlled. Not the experiences that came uncalled for, but those that arose in a 
carefully prepared situation, were the experiences with which science began, and it began with 
them because it was not then prepared for dealing with any others, not because others were of 
essentially different nature. 

Both of these features of early science—the restriction to measurement and the “experimental
method”, as it is called—have at various times been represented as the essential distinguishing
characteristic of science itself. Important as they are, I think their significance 
is quite misunderstood when they are so represented. They owe their place in science not to 
their being necessary to its nature but to the fact that they provide the easiest—indeed, human 
limitations being what they are, the only possible—starting-point for the process of rationalising
the whole of experience. The vital essence of science was the realisation that the world 
of matter, space, and time must be ignored and a fresh start made with bare experience itself. 
That having once been realised, a perfectly equipped logician provided with an unlimited 
imagination might have started equally effectively with any kind of experience and covered 
the whole field in any order he wished (I am assuming, of course, that all experience is susceptible 
of rationalisation and is not essentially chaotic). Science would then have known no distinction 
between the measurable and the non-measurable or between experiment and observation. 
Galileo was, of course, not a perfectly equipped logician, but he was an excellent mathematician 
according to the standards of his time. He accordingly selected experiences which could be 
represented by mathematical quantities, and in order that those quantities when determined 
should be such as his mathematics could relate together by simple formulae he altered the conditions
of his observation until he had attained the desired end. That is how mathematics and 
experiment gained their present place in science. 

The position to-day is very different from that which Galileo knew. During the years that 
followed the birth of science the metrical process was of course greatly extended, and in biology 
much rationalisation of experience was carried on in which measurement played no part. In 
our own time psychology has become scientific; that is to say, it has chosen as its field of study 
mental phenomena in isolation from those bodily actions which are the property of physics and 
biology. It has chosen its own concepts—the unconscious, the ego, the censor, and so on— 
and in terms of them it has related together experiences which in earlier days would have been 
rejected as mere illusions, unworthy of a moment’s attention. I think this is an event of 
tremendous significance. It does not matter whether the present conceptions and correlations 
of psychoanalysis are valid or not—indeed, it is difficult to believe that all of them are—and 
we might expect them to have the same degree of permanence as the earliest conceptions of 
electricity or magnetism. What is important is the fact that a science of pure psychology, 
divorced from physiology on the one hand and from pure reason uncontaminated by experience 
on the other, has been shown to be possible, for it confirms the belief that the scientific process 
of starting with experience itself is applicable in even the most difficult fields, and that no 
experience can be regarded as intrinsically outside the scope of the scientific treatment. 

But the greatest hope for the future seems to me to lie in the amazing progress of logic 
during the last hundred years. Where Galileo had only a rudimentary body of mathematics 
and a logic of simple syllogisms with which to face experience, we have an enormously developed 
mathematical octopus and a set of logical calculi in which mathematics itself occupies only a 
subordinate place. Numbers, which are what measurement always yields, are subsumed 
within a far greater corpus of systematised concepts, all of which is available for the scientist of 
imagination to apply to the organisation of those departments of science in which measurement 
fails. The simple idea that experience must conform to the elementary notions with which 
mathematics began is a relic of the prescientific days when the units of the world were thought 
to be material objects. Material objects obey the laws of numbers—no doubt mathematics 
began with numbers for that reason—and so those laws were thought to comprise the whole 
machinery of scientific reasoning. But even in physics itself this idea has been transcended. 
Non-Euclidean geometry and non-commutative algebra are freely called upon to extend the 
scheme of relations beyond what was possible to Galileo or Newton, and if a two-valued logic 
becomes embarrassed there is a three-valued—or indeed an n-valued—logic ready to take its 
place. It may or may not be that God made the integers and man made the rest, but it is 
certain that, if so, man can achieve what would otherwise be impossible, by using his own 
creations. It seems to me that in this enormous enlargement of the logical machinery—which 
will doubtless itself be indefinitely extended in the future—lies the possibility of a rationalisation 
of those psychological and psychical phenomena which have so far seemed to be beyond the 
possibility of scientific treatment. What is wanted is the genius to choose the right logical 
system and to correlate its elementary concepts with the appropriate phenomena. When that 
is done we shall know something. 

I would not like to end without saying something about the relation between scientific 
philosophy and what I have called the practical or commonsense view which sees experience 
organised into a world of matter in space and time, but before doing so I must interpolate a 
word about nomenclature, to avoid mere verbal confusion. I have defined the philosopher as 
one whose ambition it is to see the whole of experience as a single rationally connected system, 
and I have conjectured that, in the light of present knowledge, the probabilities are overwhelmingly
in favour of the belief that the scientific approach—that which begins by grouping 
together experiences of the same kind and not those associated with the same material object— 
is best fitted to reach the desired end. That would suggest that science is the right philosophy 
and all others are wrong; and furthermore, that no one whose aim is less universal can be 
called a philosopher at all. Words are merely labels, and the acceptance of these conclusions 
would do no harm if it were not that the words “right” and “wrong” and “philosopher” are 
so familiar with other meanings that misunderstanding would almost inevitably result. We 
do not wish, for example, to deny the title of philosopher to Hume because his work made no 
pretension of illuminating the relation of smelling to hearing, and we do not think it wrong to 
say that the sun will rise at six o’clock because the hypothesis of a rising sun does not lead to 
the most comprehensive system of astronomy. Let me, then, in what follows use the words 
“philosophy” and “philosopher” in the ordinary, rather vague sense, with reference to any 
attempt to make sense of our experience, whether complete or partial; and let me add that I 
shall not call any philosophy right or wrong without reference to its purpose. I would then 
say that if one’s purpose is to achieve the full rationalisation of the whole of experience, the 
scientific philosophy, though still incomplete, is the only one that offers any reasonable prospect 
of success; but if one has a different object in view, then the choice of a different philosophy 
may be not only legitimate but the only sensible course to adopt. 

Examples of legitimate philosophies are numerous. One thinks, for example, of various 
systems of theology for which men have been ready to die, yet which, with larger experience, 
have been discarded. Their holders have found them indispensable for rationalising their own 
limited religious experience, and it would have been the height of folly for them to have 
exchanged such theologies for a not yet achieved inclusion of religious experience within the 
scientific scheme. The art of science is long and life is short. A man must make up his 
account with the universe in the time allotted to him, and while, if he is a scientist, it is his 
duty to push forward towards the goal he will not live to reach, it is no less his duty and his 
privilege as a common man to make such rationalisation of what he feels to be the most constraining
elements of his experience as will give him the greatest immediate satisfaction. 
Similarly, art critics may well find that such a concept as beauty—and especially ugliness, 
perhaps—as an objective characteristic of a work of art may be necessary for the clear expression 
of their meaning, notwithstanding that a future inclusion of aesthetics within a wider psychological
science may show the inadmissibility of such a concept. Such examples present no 
difficulty so long as one recognises, on the one hand, that a philosophy must always be evaluated 
in relation to its purpose and to the knowledge available at the time, and, on the other hand, 
that the validity, and even necessity, of a philosophy within a limited sphere carries with it no 
guarantee that that philosophy will survive in a more complete system of correlations. 

But the relation of the scientific philosophy to the everyday philosophy based on a world of 
matter in space and time, though no different in principle, is so much more subtle in detail that 
it calls for special consideration. Even the most uncompromisingly dedicated scientist must 
live an ordinary life, and he must recognise that while, for the most exalted scientific purposes, 
he should see experience organised in the scientific manner, for all other purposes he must see 
it organised in the commonsense manner. Indeed, the very practice of science itself is a 
commonsense undertaking, and in the simplest electric circuit the physicist, often without 
noticing the incongruity, sees one part as a dance of electrons from atom to atom in an electric 
field and another part as a galvanometer. The galvanometer also, in the scientific sense, is a 
dance of electrons from atom to atom in an electric field, but it would not serve his purpose so 
to consider it; and the conductor which is the centre of interest in his experiment is, in the 
everyday sense, a piece of gross matter, but the experiment would be meaningless if he regarded 
it in that way. If the whole circuit is expressed in terms of the scientific philosophy it becomes 
a subspace in an n-dimensional manifold represented by certain equations—or by whatever more 
comprehensive generalisation may succeed such a description. This may be in one sense the 
truest and purest account we can give of it, but the mixed account of the physicist is nevertheless
the one to be preferred for the purpose of making progress in science. 

A particularly interesting example of the close interconnection of science and commonsense
is afforded by chemistry. Here we have a study which seems to give the lie to the main 
thesis of this lecture, which is that the essence of scientific philosophy lies in the ignoring of 
material objects and the adoption of the alternative arrangement of experiences into groups to 
each of which corresponds a particular science. For it turns out that when we arrange our 
experiences in this way, all the sensations with which science deals, except those of the movements
of living creatures, go to the various branches of physics, and those movements, together 
with all experiences which are not sensations, are claimed by biology and psychology. There 
appears to be nothing left for chemistry. A realist might indeed say that chemistry is the 
science of stinks, and it is true that physics puts in no counterclaim to this not altogether 
felicitous element of our sensibility, but I am afraid that there is no legitimate escape that way. 
Unlike G. K. Chesterton’s character, who played billiards in the dark entirely by the sense of 
smell, the chemist does pay some attention to other aspects of his subject-matter, and what he 
gives us is not an understanding of the olfactory sense but, as it seems, a knowledge of the 
composition of those very material bodies whose existence we said science did not acknowledge. 
How are we to account for this? 

Well, the truth is that chemistry indeed has no place in the strict scientific scheme, and that 
this is so can be seen from the fact, already evident even at the present stage of scientific 
progress, that the ultimate generalisations of chemistry are all derivable, and indeed must 
inevitably have been reached sooner or later, from the development of physics itself—chiefly 
the departments of optics and electro-magnetism. The periodic table, originally a product 
of chemical research, is a product also of spectroscopic research, and with this difference, that 
instead of showing in each place a chemical symbol and an atomic weight, the spectroscopic 
table shows a configuration of electric charges which, when fully understood, will undoubtedly 
prescribe all the varieties of chemical combination that are possible. The whole of chemistry 
may therefore, so far as final results go, be regarded as a superfluous study. 

I need scarcely say, I hope, that I do not draw from this the moral that the pursuit of 
chemistry has been a grand mistake. Reluctant as I am, and as a loyal physicist should be, 
to say anything good of chemistry, I cannot deny that, quite apart from its necessity for the 
amenities of life, it has been indispensable in making possible the rapid progress of physics. 
Without it the present state of advancement of physics would have been indefinitely postponed. 
But what does follow, I think, is that the part played by chemistry in the growth of science 
has been a pragmatical, a heuristic one. It has provided a short cut to knowledge in principle 
obtainable without it, and so, like all philosophising based on the unanalysed concept of the 
material object or material substance, it is a means of reaching an end, but does not survive 
in the end itself. It is important to understand this distinction, because chemistry has played 
so conspicuous a part in the history of science that otherwise it would seem that an interpretation 
of science which is not largely based on its peculiar character must necessarily be defective. 
That, I am convinced, would be a mistake. Chemistry rightly figures prominently in the 
history of science; in the philosophy of science it should figure not at all. But that very fact 
shows how intimately the scientific and the commonsense philosophy are interwoven. If I 
emphasise the necessity for freeing scientific philosophy from the intrusion of commonsense 
conceptions, it is not in order to depreciate commonsense but because the greater danger 
to-day lies in their confusion. 

This brings me to another, still more topical, problem—the status of the so-called social 
sciences. Their position in this country is somewhat ambiguous. They are called sciences, 
yet those who study them are not usually regarded as scientists, they do not come within the 
purview of the Royal Society or any of the ordinary scientific societies, and they have little, if 
any, interaction with the studies whose scientific character is unquestioned. In Germany, 
I believe, they are included with the natural sciences under the general name of Wissenschaft;
the latter are given the sub-title Naturwissenschaft, and the former that of Socialwissenschaft
or Geisteswissenschaft. Hence there also the social sciences seem to be regarded 
as in some sense scientific yet not of the purest blood. Those who feel most strongly that they 
should be considered sciences seem to base their conviction on the belief that “scientific 
method” is applicable to their study, though when we reflect that the method of arriving at 
the principle of least action was to prove that God works in the most economical way, and the 
method of finding a relation between the wave-lengths in the hydrogen spectrum, from which 
the whole of modern atomic theory has proceeded, was to hand the figures to a numerical 
mystic innocent of physical knowledge, we cannot help feeling a little sceptical of a decision on 
grounds of method. Let us look at the question from the point of view taken in this lecture. 

When we do that it becomes immediately obvious that we have a problem very like that of 
chemistry, and susceptible of a similar solution. Instead, however, of the material substance 
as the accepted starting-point, we have the community—a group of human beings, each of 
which is a body and mind. Such a highly complex unit is, of course, much further removed 
from the elementary starting-point of science than is the mere material object of chemistry, 
but that is only a matter of degree; in principle the situations are just the same. Science as 
such knows nothing of the community. The group of experiences that underlies such a 
conception would, in the scientific study, be divided between all the sciences. Physics would 
take those aspects of the body of every member that are exhibited also by inorganic matter, 
and biology and psychology would share the remaining bodily behaviour and the activities of 
the minds. In the course of time, when these sciences have developed sufficiently, we might 
expect to obtain a knowledge of the community which will transcend that obtainable from 
existing social studies in the same degree as the physicist’s periodic table transcends the 
chemist’s. To obtain the fullest knowledge of sociology we must therefore pursue the studies 
of physics, biology, and psychology—especially psychology. 

Even less wisely, however, than with chemistry could we conclude that therefore the social 
sciences might be discarded. We have already almost reached the stage when pure science 
can dispense with chemistry; but the day is not yet in sight when it can do without the independent
study of communities. Both in order to make civilised life possible and also in order 
to accelerate the progress of individual psychology by learning about the behaviour of men in 
the mass, we must prosecute the study of the social sciences with all the means at our command. 
This needs no emphasis. What does need emphasis at the present time is that, despite their 
urgency and indispensability, the social sciences are, from the point of view of pure knowledge, 
ultimately superfluous and intrinsically incapable of yielding anything like a full understanding 
of their own subject-matter. 

The reason why this conclusion, which follows inevitably enough from a clear understanding 
of the character and scope of science, needs to be stressed is that there are prominent and 
influential philosophies in the world to-day which claim special scientific sanction for their own 
social theories, and even in some instances assert that scientific laws are merely special cases of 
their own hypotheses of community behaviour. If I select dialectical materialism for special 
mention it is not because it is unique in this respect, but because it is the most conspicuous 
example of the philosophies I mean, and the one which probably has the greatest influence on 
modern thought. Certain principles are stated to be inevitably operative in community life 
—whether rightly or wrongly is not our present concern—and it is then deduced that those 
principles must operate also in the realms of physics and biology. The fundamental content 
of those realms is taken to be matter, and the conclusion is that the laws of matter must be 
derivable from the laws of human communities, and that all this constitutes science. For 
instance, the melting of ice when it is heated is an example of the law of internal struggle 
which is derived from the behaviour of classes of human beings under changing economic 
conditions. 

This is, of course, completely incompatible with the view which I have been trying to 
present in this lecture. To begin with, the materialistic aspect of the doctrine, exemplified in 
the assumption that physics is the study of matter, is directly at variance with what I hope 
to have shown is the historical fact that physics began to advance when it abandoned the study 
of matter. Again, the method of explaining the simple in terms of the complex is the exact 
contrary of the scientific practice of building up the complex out of the simple. Whatever 
may be said for dialectical materialism as a practical guide to politics—and I repeat that I 
am not here expressing any opinion at all about its merits or demerits in that regard—it is 
crystal clear that as a universal philosophy it is as unscientific as any system of thought could 
be. Its advocates would show less muddleheadedness and do better service to whatever of 
value their philosophy may possess if, instead of claiming that it is what it manifestly is not, 
they would contrast it with science and try to show its superiority. But I do not expect them 
to do so. 

The achievements of the scientific philosophy, begun so modestly more than 300 years ago, 
are now among the most amazing elements of our civilisation and culture. In the advancement 
of pure knowledge—the understanding of the interconnectedness of our experience—scientific 
philosophy has been without a rival. It has pressed into its service the partial practical 
philosophies that serve limited ends, and made them contribute to its advancement. In its 
turn it has enriched them by placing at their disposal knowledge which by themselves they 
would have been powerless to attain, so that we can manipulate the material world in ways 
undreamed of so long as matter was treated as a fundamental entity. Few objects could be 
more worthy of our thought than that of reaching the clearest and fullest understanding of the 
nature of scientific philosophy itself, and I hope that I have been able to contribute something 
towards that end. 


(Issued separately May 20, 1949) 




XLIII.—On the Gravitational Mass of a System of Particles. By G. L. Clark,
Trinity College, Cambridge. Communicated by Sir Edmund Whittaker, F.R.S. 

(MS. received December 20, 1945. Revised MS. received April 30, 1946. Read May 6, 1946.)

Summary 

In classical mechanics the mass of a system of gravitating particles can be defined to be the 
mass of an equivalent particle which gives the same field at great distances, or alternatively the 
mass can be defined by means of Gauss’ Theorem. Reference to the former procedure was 
made by Eddington and Clark (1938) in a discussion on the problem of n bodies. The 
relativistic extension of Gauss’ Theorem has been investigated by Whittaker (1935) for a 
particular form of the line-element and for more general fields by Ruse (1935). The latter, 
treating the problem from a purely geometrical point of view, expressed the integral of the 
normal component of the gravitational force as the sum of two volume integrals. The physical 
significance of one of these integrals was quite obvious but the meaning of the other was far 
from clear. In this paper the terms in Ruse’s result are examined as far as the order $m^2$ in 
the case of a fundamental observer at rest and the 1938 discussion modified to bring the two 
investigations into line. It is concluded that the surface integral of the normal component 
of the gravitational force taken over an infinite sphere is $-4\pi \times$ the energy of the system. 


1. The Field Due to n Bodies 


It is well known that in the case of weak fields it is convenient to introduce in place of the
$g_{\mu\nu}$ quantities $h_{\mu\nu}$, $\gamma_{\mu\nu}$ defined by the equations

$$g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}, \tag{1.1}$$

$$\gamma_{\mu\nu} = h_{\mu\nu} - \tfrac12\eta_{\mu\nu}\eta^{\sigma\rho}h_{\sigma\rho}, \tag{1.2}$$

where the $\eta_{\mu\nu}$ are the Galilean values of the $g_{\mu\nu}$, and to use the convention that Greek indices
run over the values 1, 2, 3 and 4 while Latin indices take the values 1, 2 and 3.

Denoting ordinary differentiation by a line followed by a suffix, the linear terms in the
expression for the energy-tensor are

$$16\pi T_{\mu\nu} = \gamma_{\mu\nu|ss} - \gamma_{\mu\nu|44}, \tag{1.3}$$

provided

$$\gamma_{\mu s|s} - \gamma_{\mu 4|4} = 0. \tag{1.4}$$


The $g_{\mu\nu}$ of the field having been determined, the track of a body in the field is given by the
geodesic

$$\frac{d^2x^\mu}{ds^2} + \{\alpha\beta, \mu\}\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = 0. \tag{1.5}$$


Considering only linear terms, (1.5) may be written

$$\frac{d^2x^s}{dt^2} = -\tfrac12 h_{44|s} + h_{4s|4} + v^r h_{4s|r} - v^r h_{4r|s} + \tfrac12 h_{44|4}v^s + v^r h_{rs|4}, \tag{1.6}$$

where $v^r$ are the components of the velocity of the test particle. The form (1.6), given
originally by de Sitter (1916), was derived by the author as a condition of integrability of the
field equations in 1941. If the test particle is instantaneously at rest, (1.6) reduces to

$$\frac{d^2x^s}{dt^2} = -\tfrac12 h_{44|s} + h_{4s|4}. \tag{1.7}$$




Two fields will be defined to be equivalent at great distances if a test particle has the same
acceleration when placed in each field; it would, indeed, be illogical to say that two fields are
equivalent if they produced different accelerations on a test particle. Now, in the case of
$n$ bodies, $h_{44}$ is of the order $1/r$ at great distances and the derivative $h_{44|s}$ is of the order $1/r^2$,
so for two fields to be equivalent we require the respective $h_{4s|4}$ to be equal up to order $1/r^2$.

In the 1938 paper it was explained that it was desired to determine accelerations correctly
to the squares of the potentials, and accordingly, treating the Newtonian potential as of the
first order, the order of the terms required to be retained in the $g_{\mu\nu}$ are 2, 3/2, 1 in $g_{44}$, $g_{4n}$ and
the other $g_{\mu\nu}$ respectively; and, in particular, $h_{4s|4} = \gamma_{4s|4}$ is of order 2.

The first approximation to the field due to a system of gravitating particles is given by the
expressions (3.1) of the 1938 paper; the field, in fact, is

$$\gamma_{44} = -4\sum_i\frac{m_i}{r_i}, \qquad \gamma_{4n} = 4\sum_i\frac{m_iv_{ni}}{r_i}, \tag{1.8}$$

where $v_{1i}$, $v_{2i}$, $v_{3i}$ are the components of the velocity of the $i$th particle and $\gamma_{mn}$ is of order 2.
As explained above, we do not require the individual $\gamma_{mn}$ but only the second-order terms of
$h_{44} = \tfrac12\gamma_{44} + \tfrac12\gamma_{nn}$. After some calculation, it was found that to the required order the field due
to $n$ bodies is


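$$\gamma_{4n} = 4\sum_i\frac{m_iv_{ni}}{r_i}, \qquad h_{44} = \gamma = \gamma_0 + \tfrac12\gamma_0^2 + \xi, \tag{1.9}$$

where [the defining expressions for $\gamma_0$ and $\xi$ are illegible in the source]

and $\Delta_{ij}$ is the distance between the $i$th and $j$th bodies (the value $j = i$ being omitted in the
$j$-summation).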

If we adopt a co-ordinate system in which the centre of mass of the system of $n$ particles is
at rest at the instant considered, so that $\sum_i m_iv_{ni} = 0$, the line-element at great distances is of
the form

$$ds^2 = (-1 + \gamma_0)(dx^2 + dy^2 + dz^2) + (1 + \gamma)\,dt^2, \tag{1.10}$$

provided terms of order $1/r^2$ in $h_{4n}$ are neglected. By examining the coefficient of $1/r$ in $\gamma$ it
was shown in the 1938 paper that the mass of the system at an appropriately antedated instant
is given by

$$M = \sum_i m_i + \tfrac32\sum_i m_iv_i^2 - \sum_i\sum_j\frac{m_im_j}{\Delta_{ij}} = E + \frac12\frac{d^2I}{dt^2}, \tag{1.11}$$

where

$$E = M_0 + K + \Omega, \tag{1.11a}$$


$M_0$ being the sum of the rest masses, $K$ the kinetic energy, $\Omega$ the potential energy and $I$ the
moment of inertia about the centre of gravity. The purpose of the present paper is to criticise
the former definition of the mass of a system on the ground that it is only valid in the case
when $h_{4s|4}$ is zero up to order $-2$ in $r$. For, if $h_{4s|4}$ is not zero, the $s$th component of the
acceleration of a test particle placed at a great distance from a system of $n$ bodies would differ
from the corresponding component of a test particle placed in the so-called “equivalent
field” by an amount $h_{4s|4}$. We shall take over the entire 1938 analysis and consider the effect
of the additional terms arising from $h_{4s|4}$. We conclude, if we retain terms of order $m^2$, that
a system of particles cannot, in general, be regarded at great distances as equivalent to a



single particle, and we are led to investigate an alternative line of development by means of
Gauss’ Theorem. We shall find that the inclusion of $h_{4s|4}$ leads to a term $-\tfrac12\,d^2I/dt^2$ which
cancels out the second term on the right-hand side of (1.11).

A critic may argue that the discrepancy lies in the fact that the geometrically defined
centre of mass does not necessarily have zero acceleration. This objection is not valid since
this phenomenon introduces terms of order $m^3$ and not $m^2$. In the first place, we have

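$$\begin{aligned}
\frac12\frac{d^2I}{dt^2} &= \frac12\frac{d^2}{dt^2}\sum_i m_i\{(x_i - x_0)^2 + (y_i - y_0)^2 + (z_i - z_0)^2\} \\
&= \sum_i m_i\{(\dot x_i - \dot x_0)^2 + (x_i - x_0)(\ddot x_i - \ddot x_0) + \ldots\} \\
&= \sum_i m_i\{(\dot x_i^2 + x_i\ddot x_i - 2\dot x_0\dot x_i - \ddot x_0x_i + \dot x_0^2) + \ldots\} \\
&= \frac12\frac{d^2}{dt^2}\sum_i m_i(x_i^2 + y_i^2 + z_i^2) - (\ddot x_0\textstyle\sum_i m_ix_i + \ldots) - 2(\dot x_0\textstyle\sum_i m_i\dot x_i + \ldots) + (\dot x_0^2\textstyle\sum_i m_i + \ldots),
\end{aligned} \tag{1.12}$$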


on putting the co-ordinates $(x_0, y_0, z_0)$ of the centre of mass zero after differentiation. Now the
accelerations $\ddot x_0$, $\ddot y_0$, $\ddot z_0$ and the squares of the velocities $\dot x_0^2$, $\dot y_0^2$, $\dot z_0^2$ are at least of order $m^2$, and
the terms involving them in (1.12) are of the order $m^3$ at least. Again, in the second
place, we note that if it were possible to replace the system of $n$ bodies by an equivalent particle
having the same acceleration as well as the same velocity as the centre of mass of the system,
the line-element would be of the form


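$$ds^2 = -\Big(1 + \frac{2M}{r_0}\Big)(dx^2 + dy^2 + dz^2) + \frac{8Mv^s\,dx^s\,dt}{r_0} + \Big(1 - \frac{2M}{r_0}\Big)dt^2, \tag{1.13}$$

and the acceleration of a test particle would be

$$\frac{d^2x^s}{dt^2} = -\frac{M(x^s - x_0^s)}{r_0^3} + \frac{4M\dot v^s}{r_0} + \frac{4M(x^r - x_0^r)v^rv^s}{r_0^3}, \tag{1.14}$$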


where $(x_0, y_0, z_0)$ are the co-ordinates of the centre of mass of the system and

$$v^r = \frac{dx_0^r}{dt}, \qquad \dot v^r = \frac{dv^r}{dt}, \qquad r_0^2 = (x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2. \tag{1.15}$$


Since we have chosen a co-ordinate system in which the centre of mass is at rest at the Newtonian
level, the second and third terms on the right of (1.14) are of the order $m^3$ and consequently
outside the scope of the present discussion. However, we may note that there may be terms of
order $-1$ in $r$ in the equations of motion, when we retain terms of order $m^3$. It is clear that
to the order considered in this discussion we may neglect any acceleration of the geometrically
defined centre of mass.

We now proceed to expand $\gamma_{4n}$ in powers of $1/r$. We have

$$r_i^2 = (x - x_i)^2 + (y - y_i)^2 + (z - z_i)^2 = r^2 - 2(xx_i + yy_i + zz_i) + r_{0i}^2, \tag{1.16}$$

where

$$r^2 = x^2 + y^2 + z^2 \quad\text{and}\quad r_{0i}^2 = x_i^2 + y_i^2 + z_i^2. \tag{1.17}$$

Expanding $1/r_i$ in powers of $1/r$ we obtain

$$\frac{1}{r_i} = \frac1r + \frac{xx_i + yy_i + zz_i}{r^3} + \frac{3(xx_i + yy_i + zz_i)^2}{2r^5} - \frac{r_{0i}^2}{2r^3} + \ldots \tag{1.18}$$
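As a numerical illustration of the expansion (1.18), which is not part of the original analysis, the truncated series may be compared with the exact reciprocal distance. The sketch below is only a check under stated assumptions: the function names and the sample co-ordinates are hypothetical, chosen so that the source lies close to the origin and the field point far from it.

```python
import math

def inverse_distance_exact(field_point, source_point):
    """Exact 1/r_i, where r_i is the distance from the source to the field point."""
    return 1.0 / math.dist(field_point, source_point)

def inverse_distance_series(field_point, source_point):
    """Series for 1/r_i through the terms displayed in (1.18)."""
    r = math.sqrt(sum(c * c for c in field_point))
    dot = sum(p * q for p, q in zip(field_point, source_point))  # x x_i + y y_i + z z_i
    r0_sq = sum(q * q for q in source_point)                     # r_0i squared, as in (1.17)
    return 1.0 / r + dot / r**3 + 3.0 * dot**2 / (2.0 * r**5) - r0_sq / (2.0 * r**3)

# Illustrative values only: a source near the origin and a distant field point.
field_point = (300.0, -400.0, 1200.0)
source_point = (1.0, 2.0, -0.5)

exact = inverse_distance_exact(field_point, source_point)
series = inverse_distance_series(field_point, source_point)
relative_error = abs(exact - series) / exact
monopole_error = abs(exact - 1.0 / 1300.0) / exact
print(relative_error, monopole_error)
```

The residual of the truncated series should be far smaller than that of the leading term $1/r$ alone, which is the property used when only terms up to order $-2$ in $r$ are retained.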

Substituting (1.18) in (1.8) and remembering that at the Newtonian level $\sum_i m_iv_{ni} = 0$, we find,
considering terms up to order $-2$ in $r$,

$$\gamma_{4n} = \frac{4}{r^3}\Big\{x\sum_i m_iv_{ni}x_i + y\sum_i m_iv_{ni}y_i + z\sum_i m_iv_{ni}z_i\Big\}, \tag{1.19}$$

that is,

$$\gamma_{41} = \frac{4}{r^3}\Big\{x\sum_i m_ix_i\frac{dx_i}{dt} + y\sum_i m_iy_i\frac{dx_i}{dt} + z\sum_i m_iz_i\frac{dx_i}{dt}\Big\}, \tag{1.20}$$

and similar expressions for $\gamma_{42}$ and $\gamma_{43}$.

At this point it is instructive to consider a double star having components of equal mass.
Putting $m_1 = m_2$ in equation (8.2) of the 1938 paper, it is seen that the centre of mass has no
acceleration of the order $m^2$. On the other hand, we may choose co-ordinates so that to a
first approximation,

$$\begin{aligned}
x_1 &= \tfrac12 a\cos\omega t, & y_1 &= \tfrac12 a\sin\omega t, & z_1 &= 0, \\
x_2 &= -\tfrac12 a\cos\omega t, & y_2 &= -\tfrac12 a\sin\omega t, & z_2 &= 0, \\
\dot x_1 &= -\tfrac12 a\omega\sin\omega t, & \dot y_1 &= \tfrac12 a\omega\cos\omega t, & \dot z_1 &= 0, \\
\dot x_2 &= \tfrac12 a\omega\sin\omega t, & \dot y_2 &= -\tfrac12 a\omega\cos\omega t, & \dot z_2 &= 0,
\end{aligned} \tag{1.21}$$

where $\omega$ is constant, $a$ is the distance between the bodies, and $a\omega$ is related to the mass $m$ of
one of the components of the double star by means of

$$a^2\omega^2 = \frac{2m}{a}. \tag{1.22}$$

Substituting (1.21) in (1.19) we obtain

$$\begin{aligned}
\gamma_{41} &= \frac{ma^2\omega}{r^3}\{-x\sin 2\omega t + (\cos 2\omega t - 1)y\}, \\
\gamma_{42} &= \frac{ma^2\omega}{r^3}\{x(1 + \cos 2\omega t) + y\sin 2\omega t\}, \\
\gamma_{43} &= 0.
\end{aligned} \tag{1.23}$$
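The passage from the sums in (1.19) to the closed forms (1.23) can be checked numerically. The following Python sketch is an illustration only: the helper names, the sign convention $\gamma_{4n} = +(4/r^3)\{\ldots\}$ and the sample values of $m$, $a$, $\omega$ and the field point are assumptions made for the check, not quantities taken from the paper.

```python
import math

def double_star_state(m, a, omega, t):
    """Positions and velocities of two equal masses on a circular orbit, as in (1.21)."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    position_1 = (0.5 * a * c, 0.5 * a * s, 0.0)
    velocity_1 = (-0.5 * a * omega * s, 0.5 * a * omega * c, 0.0)
    position_2 = tuple(-p for p in position_1)
    velocity_2 = tuple(-v for v in velocity_1)
    return [(m, position_1, velocity_1), (m, position_2, velocity_2)]

def gamma_4n_from_sums(field_point, bodies, n):
    """Evaluate (4/r^3) * sum_i m_i v_ni (x x_i + y y_i + z z_i), the form of (1.19)."""
    r = math.sqrt(sum(c * c for c in field_point))
    total = 0.0
    for mass, position, velocity in bodies:
        total += mass * velocity[n] * sum(f * p for f, p in zip(field_point, position))
    return 4.0 * total / r**3

def gamma_41_closed(field_point, m, a, omega, t):
    """First line of (1.23)."""
    x, y, z = field_point
    r = math.sqrt(x * x + y * y + z * z)
    angle = 2.0 * omega * t
    return m * a**2 * omega / r**3 * (-x * math.sin(angle) + (math.cos(angle) - 1.0) * y)

def gamma_42_closed(field_point, m, a, omega, t):
    """Second line of (1.23)."""
    x, y, z = field_point
    r = math.sqrt(x * x + y * y + z * z)
    angle = 2.0 * omega * t
    return m * a**2 * omega / r**3 * (x * (1.0 + math.cos(angle)) + y * math.sin(angle))

# Illustrative values only.
m, a, omega, t = 1.0e-3, 2.0, 0.05, 3.7
field_point = (50.0, -30.0, 20.0)
bodies = double_star_state(m, a, omega, t)
print(gamma_4n_from_sums(field_point, bodies, 0), gamma_41_closed(field_point, m, a, omega, t))
print(gamma_4n_from_sums(field_point, bodies, 1), gamma_42_closed(field_point, m, a, omega, t))
```

The index `n` runs over 0, 1, 2 in the code and corresponds to $n = 1, 2, 3$ in the text.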

According to the 1938 discussion the “equivalent mass” of the double star is given by (1.11),
namely

$$M = 2m - \tfrac14 ma^2\omega^2 = 2m - \frac{m^2}{2a}, \tag{1.24}$$

since

$$K = \tfrac14 ma^2\omega^2, \qquad \Omega = -\frac{m^2}{a}.$$
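The arithmetic leading to (1.24) can be reproduced exactly with rational numbers. The sketch below is an illustration under stated assumptions: it takes the orbital relation in the Newtonian form $a^2\omega^2 = 2m/a$ (units with the gravitational constant equal to unity), uses $M = M_0 + 3K + 2\Omega$ for the 1938 “equivalent mass”, and the function name and sample values are hypothetical.

```python
from fractions import Fraction

def double_star_energies(m, a):
    """Kinetic energy, potential energy, 1938 'equivalent mass' and energy of the double star."""
    omega_sq = 2 * m / a**3                          # assumed orbital relation a^2 w^2 = 2m/a
    kinetic = Fraction(1, 4) * m * a**2 * omega_sq   # K = (1/4) m a^2 w^2
    potential = -m**2 / a                            # Omega = -m^2 / a
    equivalent_mass = 2 * m + 3 * kinetic + 2 * potential
    energy = 2 * m + kinetic + potential
    return kinetic, potential, equivalent_mass, energy

# Illustrative values only.
m = Fraction(1, 100)
a = Fraction(10)
K, Omega, M, E = double_star_energies(m, a)
print(K, Omega, M, E)
```

For a circular orbit $2K + \Omega$ vanishes, so in this particular example the “equivalent mass” and the energy coincide; the criticism in the text concerns the periodic terms in the field, not this value.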


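The motion of a particle at great distances is, by (1.7) and (1.23),

$$\begin{aligned}
\frac{d^2x}{dt^2} &= -\frac{Mx}{r^3} + \frac{2ma^2\omega^2}{r^3}(-x\cos 2\omega t - y\sin 2\omega t), \\
\frac{d^2y}{dt^2} &= -\frac{My}{r^3} + \frac{2ma^2\omega^2}{r^3}(-x\sin 2\omega t + y\cos 2\omega t), \\
\frac{d^2z}{dt^2} &= -\frac{Mz}{r^3},
\end{aligned} \tag{1.25}$$

where $M$ is given by (1.24).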
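The periodic terms in (1.25) are the time derivatives $h_{4s|4} = \gamma_{4s|4}$ of the potentials (1.23). A finite-difference check of the $x$-component is sketched below; the function names, the step length and the sample values are assumptions made for the illustration.

```python
import math

def gamma_41(x, y, r, m, a, omega, t):
    """First potential of (1.23) at a fixed field point."""
    angle = 2.0 * omega * t
    return m * a**2 * omega / r**3 * (-x * math.sin(angle) + (math.cos(angle) - 1.0) * y)

def periodic_x_term(x, y, r, m, a, omega, t):
    """Periodic part of the x-acceleration in (1.25)."""
    angle = 2.0 * omega * t
    return 2.0 * m * a**2 * omega**2 / r**3 * (-x * math.cos(angle) - y * math.sin(angle))

def central_difference(function, t, step):
    """Symmetric finite-difference estimate of d(function)/dt."""
    return (function(t + step) - function(t - step)) / (2.0 * step)

# Illustrative values only.
x, y, z = 40.0, -25.0, 10.0
r = math.sqrt(x * x + y * y + z * z)
m, a, omega, t = 1.0e-3, 2.0, 0.3, 1.3

numerical = central_difference(lambda s: gamma_41(x, y, r, m, a, omega, s), t, 1.0e-4)
analytic = periodic_x_term(x, y, r, m, a, omega, t)
print(numerical, analytic)
```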

Since the centre of mass of the system has no acceleration up to order $m^2$, it is obvious
that the periodic terms in (1.25) are not due to any motion of the centre of gravity. Moreover,
if we use present instead of retarded values and retain the term 2 mf in (1.9), we find that 
entirely new terms appear in (1.25). The periodic terms in (1.25) are not therefore connected 
in any way with the fact that retarded potentials are used in evaluating M. As a consequence 
of these facts we are led to the conviction that the field due to two bodies of equal mass cannot 
be regarded as equivalent to the field due to a single particle at great distances if terms of
order $m^2$ are retained. As a matter of fact the potentials (1.23) are of precisely the same form
as potentials due to a rotating cohesive system such as a rotating rod, but the equations (1.25)
are not valid for the latter system owing to the presence of additional periodic terms of order
$1/r$ in $h_{44}$.





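In general it is not possible to express the potentials in forms which are valid for all time,
but we can usually obtain the instantaneous values of the potentials. For example, in the
case of a rigid continuous system rotating with angular velocities $(\omega_1, \omega_2, \omega_3)$ about the axes,
we may write

$$v_{1i} = \omega_2z_i - \omega_3y_i, \qquad v_{2i} = \omega_3x_i - \omega_1z_i, \qquad v_{3i} = \omega_1y_i - \omega_2x_i, \tag{1.26}$$

and, if $A$, $B$, $C$, $F$, $G$, $H$ denote the moments and products of inertia, (1.19) becomes

$$\gamma_{41} = \frac{4}{r^3}\Big\{x(G\omega_2 - H\omega_3) + y\Big(F\omega_2 - \frac{C + A - B}{2}\omega_3\Big) + z\Big(\frac{A + B - C}{2}\omega_2 - F\omega_3\Big)\Big\}, \tag{1.27}$$

and similar expressions for $\gamma_{42}$ and $\gamma_{43}$.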

If the moments and products of inertia are expressed in terms of the principal moments of
inertia referred to axes moving with the body and instantaneously coinciding with the
co-ordinate axes, we have [Clark, 1941, equation (7.4)]


Geo 2 -(A-C )zzr 2 2 /, 
Eoj z = ( 2 ? - A)w z 2 t } - 

aj s — vj s + w s t, 


(r.28) 


where $\varpi_s$, $\dot\varpi_s$ are the instantaneous values of the components of angular velocity and
acceleration, and cubes and higher powers of $\varpi_s$ are neglected. The angular acceleration is given by
Euler’s equations and is consequently of the order $\varpi^2$. On substituting (1.28) in (1.27) we
obtain


Yu = “^*[(-4 ~ C)m£t + {A- +y |V - - - - - -- — (ro 3 + w 3 /f)"| 


f A+B-C 

+ 2 


L 


(flT 3 + ro 3 /) — (C —^)rojW 2 /j| (1.29) 


■} 


The component of the acceleration in the direction of the $x$-axis of a test particle placed at a
great distance from the system is accordingly

d 2 x Mx ax. n Ayf C+A-B 

4 *(A+B-C. 1 

+ (1.30) 

where, as in (1.25), $M$ denotes the “equivalent mass” in the 1938 sense.

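We conclude this part of the discussion by remarking that it is evident from (1.20) that
the conditions that (1.10) should represent the system are:

$$\begin{aligned}
\sum_i m_ix_i\frac{dx_i}{dt} &= \sum_i m_iy_i\frac{dx_i}{dt} = \sum_i m_iz_i\frac{dx_i}{dt} = 0, \\
\sum_i m_ix_i\frac{dy_i}{dt} &= \sum_i m_iy_i\frac{dy_i}{dt} = \sum_i m_iz_i\frac{dy_i}{dt} = 0, \\
\sum_i m_ix_i\frac{dz_i}{dt} &= \sum_i m_iy_i\frac{dz_i}{dt} = \sum_i m_iz_i\frac{dz_i}{dt} = 0.
\end{aligned} \tag{1.31}$$

Hence

$$\frac12\frac{d^2I}{dt^2} = \frac12\frac{d}{dt}\sum_i 2m_i\Big(x_i\frac{dx_i}{dt} + y_i\frac{dy_i}{dt} + z_i\frac{dz_i}{dt}\Big) = 0, \tag{1.32}$$

and consequently (1.11) reduces to

$$M = E. \tag{1.33}$$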




The result (1.32) also holds in the case of a rigid system rotating about an axis. That is, 
it applies when terms of the type (1.29) are added to the line-element (1.10). To investigate 
the effect of the terms (1.19) in the general case, it is necessary to consider the problem from 
the point of view of Gauss’ Theorem. 


2. Gauss’ Theorem 
Taking a line-element of the form 

$$ds^2 = g_{44}\,dt^2 + g_{mn}\,dx^m\,dx^n, \tag{2.1}$$

where $g_{44}$ may be a function of all four co-ordinates and $g_{mn}$ functions of $x^s$ only, Whittaker
(1935) showed that the quantity

$$4\pi(T_4^4 - T_1^1 - T_2^2 - T_3^3)\sqrt{-g} \tag{2.2}$$

could be expressed as a divergence

$$\{\sqrt{-g}\,g^{44}\{44, s\}\}_{|s}, \tag{2.3}$$

where the Christoffel bracket $\{44, s\}$ reduces to $-\tfrac12 g^{sr}g_{44|r}$ for the metric (2.1). Integrating
throughout the volume considered he obtained Gauss’ Theorem in the form

$$\begin{aligned}
\iiint\{\sqrt{-g}\,g^{44}\{44, s\}\}_{|s}\,dx\,dy\,dz &= 4\pi\iiint(T_4^4 - T_1^1 - T_2^2 - T_3^3)\sqrt{-g}\,dx\,dy\,dz \\
&= 4\pi\iiint(T - 2T_1^1 - 2T_2^2 - 2T_3^3)\sqrt{-g}\,dx\,dy\,dz,
\end{aligned} \tag{2.4}$$

which, in the case of weak fields, may be written

$$\iiint(-\tfrac12 h_{44|s})_{|s}\,dx\,dy\,dz = -4\pi\iiint(T + 2T_{ll})\sqrt{-g}\,dx\,dy\,dz. \tag{2.5}$$

Now the invariant mass, $m$, of a single particle is given by

$$m = \iiint T\sqrt{-g}\,\frac{dt}{ds}\,dx\,dy\,dz,$$

and so

$$\iiint T\sqrt{-g}\,dx\,dy\,dz = m - \tfrac12 mv^2 + \tfrac12 mh_{44}. \tag{2.6}$$

Summing over all particles, we get

$$\iiint T\sqrt{-g}\,dx\,dy\,dz = M_0 - K + 2\Omega, \tag{2.7}$$

where, as in (1.11a), $M_0$ is the sum of the rest masses, $K$ the kinetic energy and $\Omega$ the potential
energy.

We also may write

$$T_{mn} = \rho v_mv_n + p_{mn}, \tag{2.8}$$

where $p_{mn}$ represent the stresses. In a subsequent paper we shall derive the equation

$$\int p_{ll}\,dV = 0. \tag{2.9}$$

Using the result (2.9), we have to the order considered

$$2\iiint T_{ll}\sqrt{-g}\,dx\,dy\,dz = 4K. \tag{2.10}$$

Substituting (2.7) and (2.10) in (2.5), we see that the volume integral on the right-hand side
of the equation has the value

$$-4\pi(\Sigma m + 3K + 2\Omega) = -4\pi\Big(E + \frac12\frac{d^2I}{dt^2}\Big) \tag{2.11}$$


See p. 432, equ. (5.8). 





on using the relation [Eddington, 1916, equation (4)]

$$2K + \Omega = \frac12\frac{d^2I}{dt^2} \tag{2.12}$$

for a system of particles. The result (2.11) is identical with (1.11) and the two methods of
analysis are equivalent. However, as already mentioned, the term $d^2I/dt^2$ is zero for rigid
continuous systems.
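The relation (2.12) can be illustrated for point masses under Newtonian attraction, for which $\tfrac12\,d^2I/dt^2 = \sum_i m_i(\mathbf v_i\cdot\mathbf v_i + \mathbf x_i\cdot\ddot{\mathbf x}_i)$. The sketch below is a numerical check only; it assumes units with the gravitational constant equal to unity, takes the moment of inertia about the origin, and uses hypothetical function names and an arbitrary three-body configuration.

```python
import math

def kinetic_energy(masses, velocities):
    """K = sum of (1/2) m v^2."""
    return sum(0.5 * m * sum(c * c for c in v) for m, v in zip(masses, velocities))

def potential_energy(masses, positions):
    """Omega = - sum over pairs of m_i m_j / Delta_ij."""
    total = 0.0
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            total -= masses[i] * masses[j] / math.dist(positions[i], positions[j])
    return total

def newtonian_accelerations(masses, positions):
    """Acceleration of each body due to the attraction of all the others."""
    accelerations = []
    for i, p_i in enumerate(positions):
        acc = [0.0, 0.0, 0.0]
        for j, p_j in enumerate(positions):
            if i == j:
                continue
            d = math.dist(p_i, p_j)
            for k in range(3):
                acc[k] += masses[j] * (p_j[k] - p_i[k]) / d**3
        accelerations.append(acc)
    return accelerations

def half_d2I_dt2(masses, positions, velocities):
    """(1/2) d^2 I / dt^2 = sum m (v.v + x.a), with I taken about the origin."""
    accelerations = newtonian_accelerations(masses, positions)
    total = 0.0
    for m, x, v, a in zip(masses, positions, velocities, accelerations):
        total += m * (sum(c * c for c in v) + sum(p * q for p, q in zip(x, a)))
    return total

# Arbitrary illustrative configuration.
masses = [1.0, 2.0, 0.5]
positions = [(1.0, 0.0, 0.0), (-0.5, 0.8, 0.1), (0.2, -1.1, 0.7)]
velocities = [(0.0, 0.3, 0.0), (0.1, -0.2, 0.05), (-0.4, 0.2, -0.2)]

K = kinetic_energy(masses, velocities)
Omega = potential_energy(masses, positions)
half_second_derivative = half_d2I_dt2(masses, positions, velocities)
print(half_second_derivative, 2.0 * K + Omega)
```

With these quantities the two sides of (2.11) agree identically, since $M_0 + 3K + 2\Omega = (M_0 + K + \Omega) + (2K + \Omega)$.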

We now proceed to consider the additional terms appearing in (2.4) when the line-element
is of the general form

$$ds^2 = g_{\mu\nu}\,dx^\mu\,dx^\nu. \tag{2.13}$$

Throughout the discussion we have restricted ourselves to the consideration of terms up to
order $m^2$, and consequently the only contribution to (2.11) from the squares of the potentials
$h_{\mu\nu}$ is

$$-8\pi\Omega = -2\pi\int Th_{44}\,dV.$$

In passing over to the general form (2.13) we naturally retain this term, but the additional
terms all involve $h_{\mu\nu}$ linearly and the investigation is therefore quite straightforward.


3. The Gravitational Force 

The linear terms in the potentials $h_{\mu\nu}$ in the equations of motion of a test particle are given
in (1.6). Taking the test particle to be of unit mass, we see that the expression on the right-hand
side of this equation is the gravitational force $X^s$, and if the test particle is at rest at the
instant considered, the gravitational force is simply

$$X^s = -\tfrac12 h_{44|s} + h_{4s|4}, \tag{3.1}$$

and the divergence of $X^s$ is

$$X^s{}_{|s} = -\tfrac12 h_{44|ss} + h_{4s|s4}. \tag{3.2}$$

Introducing the $\gamma_{\mu\nu}$ by means of (1.2), this divergence may be written

$$X^s{}_{|s} = -\tfrac14(\gamma_{44|ss} - \gamma_{44|44}) - \tfrac14(\gamma_{ll|ss} - \gamma_{ll|44}) + h_{4s|s4} - \tfrac12 h_{44|44}. \tag{3.3}$$

If the condition (1.4) is satisfied, (3.3) becomes

$$X^s{}_{|s} = -4\pi(T_{44} + T_{ll}) + h_{4s|s4} - \tfrac12 h_{44|44} \tag{3.4}$$

on using (1.3). As we have seen, $\gamma_{mn}$ is of the second order and its derivative $\gamma_{mn|44}$ is of the
third order; we may therefore express (3.4), with the aid of (1.4), in the form

$$X^s{}_{|s} = -4\pi(T_{44} + T_{ll}) + \gamma_{4s|s4} - \tfrac14\gamma_{44|44} \tag{3.5}$$

$$\phantom{X^s{}_{|s}} = -4\pi(T + 2T_{ll}) + \tfrac34\gamma_{4s|s4}, \tag{3.6}$$

since to the order considered

$$T = T_{44} - T_{ll}. \tag{3.7}$$

As far as the linear terms are concerned, Gauss’ Theorem accordingly takes the form

$$\int n_sX^s\,dS = -4\pi\int(T + 2T_{ll})\,dV + \tfrac34\int\gamma_{4s|s4}\,dV, \tag{3.8}$$

where $n_s$ is the outward normal and $X^s$ is given by (3.1).

Comparing (3.8) with (2.5) we see that the effect of the additional potential $\gamma_{4n}$ in the line-element
is to increase the gravitational force by $h_{4s|4}$ and to introduce a further volume integral
$\tfrac34\int\gamma_{4s|s4}\,dV$. Applying the relations (2.8) and (2.10) to the first integral on the right-hand side
of (3.8), we find that its value is

$$-4\pi(\Sigma m + 3K). \tag{3.9}$$



We mentioned at the end of the previous section that the contribution $-8\pi\Omega$ arising from the
squares of the potentials $h_{\mu\nu}$ must be added to (3.9); the sum

$$-4\pi(\Sigma m + 3K + 2\Omega) \tag{3.10}$$

being in agreement with (2.11) and (1.11). Moreover, the form of the equation (1.3) indicates
that retarded values must be used in calculating the potentials.

In a rigorous discussion, covering the case of strong fields, Ruse (1935, p. 151) expresses 
Gauss’ Theorem in a form similar to (3.8). One integral gives immediately the expression 
(3.10) for the system considered in this paper. The integral corresponding to the second 
integral on the right-hand side of (3.8) is 

-\m P ^) a +x^)^}dv, (3.11) 


where $\lambda^\alpha$ is the unit 4-vector of the fundamental observer and the brackets ( ) denote covariant
differentiation. To the order of magnitude considered in this paper we may take

$$\lambda_s = 0, \quad \lambda_4 = 1 + \tfrac12 h_{44}; \qquad \lambda^s = 0, \quad \lambda^4 = 1 - \tfrac12 h_{44}. \tag{3.12}$$

The only terms of order m 2 in (3.11) come from - X a (XP)p a , namely 

-A«g{AV(-.?)}= -f 2 [(i -P44X1 -tai+i* 44)] 

= PzZ |44 

“ f 744144 

“ f 74 s1s4j (3-13) 

in agreement with (3.8). 

Since $\gamma_{4s|s4}$ is in the form of a divergence, the volume integral $\int\gamma_{4s|s4}\,dV$ can be evaluated
by means of a surface integral; that is, (3.8) may be written in the form

$$\int n_sX^s\,dS = -4\pi\int(T + 2T_{ll})\,dV - 2\pi\int Th_{44}\,dV + \tfrac34\int(n_s\gamma_{4s|4})\,dS \tag{3.14}$$

on including the quantity

$$-8\pi\Omega = -2\pi\int Th_{44}\,dV.$$

Now if we integrate first through regions containing matter only and then throughout
empty space, the surface integrals, evaluated at the boundaries, will cancel out in pairs since
the potentials and their first derivatives are continuous. Consequently, when integrating
throughout all space we need only evaluate the surface integrals over an infinite sphere and,
in the case of $n$ bodies, we may use the approximate value of $\gamma_{4n}$ given by (1.19). We then
have

$$\gamma_{4n|4} = \frac{4}{r^3}\frac{d}{dt}\Big\{x\sum_i m_iv_{ni}x_i + y\sum_i m_iv_{ni}y_i + z\sum_i m_iv_{ni}z_i\Big\}. \tag{3.15}$$

Also the direction cosines of the outward normal for a sphere are $x^s/r$, and so

$$\int(n_s\gamma_{4s|4})\,dS = \frac1r\int x^s\gamma_{4s|4}\,dS. \tag{3.16}$$

From (3.15) and (3.16) we see that the integral consists of partial integrals of the type

$$\int x^sx^r\,dS.$$

This is zero unless $s = r$, in which case we have

$$\int x^2\,dS = \int y^2\,dS = \int z^2\,dS = \tfrac43\pi r^4. \tag{3.17}$$

With the help of (3.17) we then obtain from (3.16) the result

$$\int(n_s\gamma_{4s|4})\,dS = \frac{16\pi}{3}\frac{d}{dt}\sum_i m_i(x_i\dot x_i + y_i\dot y_i + z_i\dot z_i) = \frac{8\pi}{3}\frac{d^2}{dt^2}\sum_i m_ir_{0i}^2, \tag{3.18}$$

where $r_{0i}$ is defined by (1.17).

Combining (3.18) and (2.11), we finally obtain Gauss’ Theorem in the form

$$\int n_sX^s\,dS = -4\pi E, \tag{3.19}$$

as the terms in $d^2I/dt^2$ cancel. The result (3.19) may be stated thus: “The outward flux of
the gravitational force due to several moving particles taken over an infinite sphere is equal
to $-4\pi$ × the energy of the system.”
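The surface integrals used above, $\int x^2\,dS = \tfrac43\pi r^4$ and $\int xy\,dS = 0$ over a sphere of radius $r$, can be confirmed by direct quadrature. The sketch below is an illustration only: the midpoint rule, the grid sizes, the radius and the function name are choices made for the check.

```python
import math

def sphere_surface_integral(integrand, radius, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the integral of integrand(x, y, z) over a sphere."""
    d_theta = math.pi / n_theta
    d_phi = 2.0 * math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        sin_theta, cos_theta = math.sin(theta), math.cos(theta)
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            x = radius * sin_theta * math.cos(phi)
            y = radius * sin_theta * math.sin(phi)
            z = radius * cos_theta
            # Surface element dS = r^2 sin(theta) d(theta) d(phi).
            total += integrand(x, y, z) * radius**2 * sin_theta * d_theta * d_phi
    return total

radius = 2.0
integral_xx = sphere_surface_integral(lambda x, y, z: x * x, radius)
integral_zz = sphere_surface_integral(lambda x, y, z: z * z, radius)
integral_xy = sphere_surface_integral(lambda x, y, z: x * y, radius)
expected = 4.0 * math.pi * radius**4 / 3.0
print(integral_xx, integral_zz, integral_xy, expected)
```

The factor $\tfrac34$ multiplying the surface integral then turns $\tfrac{8\pi}{3}\,d^2I/dt^2$ into $2\pi\,d^2I/dt^2 = 4\pi\cdot\tfrac12\,d^2I/dt^2$, which is the quantity that cancels against the corresponding term of (2.11).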

In the above analysis we expressed the volume integral $\int\gamma_{4s|s4}\,dV$ as a surface integral
which we evaluated over an infinite sphere. It may be thought that a more detailed investigation
would show that the only contributions to the volume integral come from regions
containing matter. This does not appear to be the case; in fact, it seems that

$$\int\gamma_{4s|s4}\,dV = \frac{16\pi}{3}\Omega \tag{3.20}$$

when integration is over the part of space containing no matter, and

$$\int\gamma_{4s|s4}\,dV = \frac{16\pi}{3}\sum_i m_iv_i^2 \tag{3.21}$$

when integration is over the part of space containing matter. The sum of (3.20) and (3.21)
gives

$$\int\gamma_{4s|s4}\,dV = \frac{8\pi}{3}\frac{d^2I}{dt^2} \tag{3.22}$$

as in (3.18), integration now being over all space.

For the sake of illustration let us suppose the $i$th body to be spherical, of constant density
and of small radius $a_i$, which can be made to vanish in the limit. The part of $\gamma_{4s|s4}$ due to
the other particles will accordingly not contribute to

$$\int_{i\text{th particle}}\gamma_{4s|s4}\,dV. \tag{3.23}$$

The internal potential $\gamma_{4s}$ for the $i$th particle is simply

$$(\gamma_{4s})_{\rm int} = \frac{2m_iv_{si}}{a_i^3}(3a_i^2 - r_i^2), \tag{3.24}$$

for the expression (3.24) and its derivatives are continuous at $r_i = a_i$ with

$$(\gamma_{4s})_{\rm ext} = \frac{4m_iv_{si}}{r_i}. \tag{3.25}$$

Substituting (3.24) in (3.23) gives

$$\int_{i\text{th particle}}\gamma_{4s|s4}\,dV = \frac{16\pi}{3}m_iv_i^2, \tag{3.26}$$



in agreement with (3.21). The result (3.26) can be verified by considering the integral

$$\int n_s\gamma_{4s|4}\,dS$$

extended over the surface of the $i$th particle and using either the internal or external form of
the potential $\gamma_{4s}$. Further insight into the theory is obtained by writing (3.20) and (3.21) in
the forms

$$\tfrac34\int_{\text{empty space}}\gamma_{4s|s4}\,dV = 4\pi\Omega, \tag{3.27}$$

$$\tfrac34\int_{\text{matter}}\gamma_{4s|s4}\,dV = 4\pi(2K). \tag{3.28}$$

The formula (3.19) may then be written

$$\int n_sX^s\,dS = -4\pi(E_i + E_0), \tag{3.29}$$

where $E_i$, $E_0$ are respectively the contributions from regions containing matter and empty
space. The actual values of $E_i$ and $E_0$ are found from (3.10), (3.27) and (3.28); they are:

$$E_i = M_0 + 3K + 2\Omega - 2K = M_0 + K + 2\Omega = M_0 - 3K + \frac{d^2I}{dt^2}, \tag{3.30}$$

$$E_0 = -\Omega = 2K - \frac12\frac{d^2I}{dt^2}, \tag{3.31}$$

where $M_0$ is the sum of the rest masses, and the total energy is

$$E = E_i + E_0 = M_0 + K + \Omega = M_0 - K + \frac12\frac{d^2I}{dt^2}.$$

For steady systems, for which $d^2I/dt^2$ is zero,

$$E = M_0 - K.$$
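The partition of the energy between regions containing matter and empty space can be tabulated with exact arithmetic. The sketch below is an illustration under stated assumptions: it takes $E_i = M_0 + K + 2\Omega$ and $E_0 = -\Omega$ as in (3.30) and (3.31), uses $2K + \Omega = \tfrac12\,d^2I/dt^2$ from (2.12), and the function name and sample values are hypothetical.

```python
from fractions import Fraction

def energy_partition(rest_mass, kinetic, potential):
    """Return (E_i, E_0, E, half_d2I) for a system of particles in linear motion."""
    matter = rest_mass + kinetic + 2 * potential   # E_i, contribution of regions containing matter
    empty = -potential                             # E_0, contribution of empty space
    total = matter + empty                         # E = M_0 + K + Omega
    half_d2I = 2 * kinetic + potential             # (1/2) d^2 I / dt^2 by (2.12)
    return matter, empty, total, half_d2I

# Illustrative steady system: 2K + Omega = 0.
M0 = Fraction(3)
K = Fraction(1, 8)
Omega = Fraction(-1, 4)
E_i, E_0, E, half_d2I = energy_partition(M0, K, Omega)
print(E_i, E_0, E, half_d2I)
```

For the sample values the second derivative of the moment of inertia vanishes, so the total reduces to $M_0 - K$, the steady-system form of the result.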


4. Isolated Systems 

Throughout the preceding analysis we have restricted the investigation to linear motion,
and the results we have obtained depend on equation (2.9). In the case of a rotating cohesive
system, the analysis from which this equation was deduced no longer applies and the result
(2.9) no longer holds. I have elsewhere (1946) calculated the gravitational field of a slowly
rotating nearly spherical ellipsoid. For this system, I have found that

$$\int T_{ll}\,dV = 0. \tag{4.1}$$

That is,

$$\int p_{ll}\,dV = -\int\rho v_sv_s\,dV = -2K_R. \tag{4.2}$$

We may regard the integral on the left-hand side of (4.2) as being a rotational potential
energy $\Omega_R$; the equation (4.2) then expresses the fact that

$$\Omega_R = -2K_R \tag{4.3}$$




and is analogous to equation (2.12) for systems (as in this case) for which $d^2I/dt^2$
is zero.

The equations (3.20) and (3.21) and therefore (3.30) and (3.31) do not apply to this system.
I have substituted the potential $\gamma_{4s}$, for the internal and external fields due to the rotating
system, in the volume integral $\int\gamma_{4s|s4}\,dV$. It is found that the integral vanishes when integrated
in the region containing matter and also when integrated throughout empty space. That is,

$$E_0 = 0, \tag{4.4}$$

and

$$E_i = M_0 + 3K_R + 2\Omega_R = M_0 - K_R, \tag{4.5}$$

on using (4.3).

A general system will consist of several bodies having both linear and angular velocities.
The total kinetic and potential energies will be given by

$$K = K_L + K_R, \tag{4.6}$$

$$\Omega = \Omega_L + \Omega_R, \tag{4.7}$$

where $K_L$ is the sum of the kinetic energies of the linear motion of each body, $K_R$ is the sum
of the kinetic energies of the rotational motion of each body relative to its centre of gravity,
$\Omega_L$ is the ordinary potential energy and $\Omega_R$ is the rotational energy defined by

$$\Omega_R = \int p_{ll}\,dV. \tag{4.8}$$

From (2.12) and (4.3) we have

$$\Omega_L = -2K_L + \frac12\frac{d^2I}{dt^2} \tag{4.9}$$

and

$$\Omega_R = -2K_R, \tag{4.10}$$

so that

$$\Omega = -2K + \frac12\frac{d^2I}{dt^2}. \tag{4.11}$$

Also from (3.30), (3.31), (4.4) and (4.5) we have, on using (4.9)-(4.11),

$$E_0 = -\Omega_L, \tag{4.12}$$

$$E_i = M_0 + K_L + 2\Omega_L + 3K_R + 2\Omega_R = M_0 + K_L + 2\Omega_L + K_R + \Omega_R \tag{4.13}$$

$$\phantom{E_i} = M_0 - 3K_L - K_R + \frac{d^2I}{dt^2}, \tag{4.14}$$

and

$$E = M_0 + K_L + \Omega_L + K_R + \Omega_R = M_0 + K + \Omega \tag{4.15}$$

$$\phantom{E} = M_0 - K + \frac12\frac{d^2I}{dt^2}. \tag{4.16}$$

In the case of steady systems the expressions (4.15) and (4.16) link up with an alternative line
of development given by Eddington in his Dublin Lectures (1943, sections 11 and 13). In
these lectures Eddington partitions the total energy-tensor into a particle energy-tensor
$\mathcal{K}_{\mu\nu}$ and a field energy-tensor $\Omega_{\mu\nu}$ by means of

$$T_{\mu\nu} = \mathcal{K}_{\mu\nu} + \Omega_{\mu\nu}, \tag{4.17}$$

$$\Omega_{\mu\nu} = -(k + 1)\mathcal{K}_{\mu\nu}, \tag{4.18}$$

and

$$T_{\mu\nu} = -k\mathcal{K}_{\mu\nu}, \tag{4.19}$$



where the factor $k$ represents the number of independent components of the energy-tensor
and $k = 1$ for particles considered in molar relativity theory. Putting $k = 1$ in (4.18) and (4.19)
we obtain the results (4.11) and (4.16) with $d^2I/dt^2$ zero. The expressions (4.12) and (4.13)
assert, moreover, that in empty space the field energy is $-\Omega_L$ and in regions containing matter
it is $2\Omega_L + \Omega_R$, the total field energy being $\Omega_L + \Omega_R$.

When the system consists of a single body $\Omega_L = 0$. In the absence of rotation, we then have

$$E_0 = 0, \tag{4.20}$$

$$E = E_i = M_0 + K_L, \tag{4.21}$$

and there is no inversion of energy. On the other hand when the isolated particle has rotational
as well as linear motion,

$$E = M_0 + K_L + K_R + \Omega_R = M_0 + K_L - K_R. \tag{4.22}$$

That is, there is inversion of rotational energy but not of the energy of the linear motion. In
conclusion, we remark that the result (4.20) follows directly from (3.20), since the acceleration
of an isolated particle is zero.
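The bookkeeping of this section, for bodies having both linear and rotational motion, can be collected in a few lines. The sketch below is an illustration under stated assumptions: it takes $\Omega_R = -2K_R$ from (4.3), $E_0 = -\Omega_L$ and $E_i = M_0 + K_L + 2\Omega_L + K_R + \Omega_R$ as read from (4.12) and (4.13); the function name and the sample values are hypothetical.

```python
from fractions import Fraction

def partition_with_rotation(rest_mass, kinetic_linear, kinetic_rotational, potential_linear):
    """Return (E_i, E_0, E) when rotational energy is included."""
    potential_rotational = -2 * kinetic_rotational          # Omega_R = -2 K_R
    empty = -potential_linear                               # E_0 = -Omega_L
    matter = (rest_mass + kinetic_linear + 2 * potential_linear
              + kinetic_rotational + potential_rotational)  # E_i
    return matter, empty, matter + empty

# Single body (Omega_L = 0), first without and then with rotation.
M0 = Fraction(5)
K_L = Fraction(1, 10)
K_R = Fraction(1, 40)

E_i_plain, E_0_plain, E_plain = partition_with_rotation(M0, K_L, Fraction(0), Fraction(0))
E_i_spin, E_0_spin, E_spin = partition_with_rotation(M0, K_L, K_R, Fraction(0))
print(E_plain, E_spin)
```

The two calls reproduce the single-body statements of the text: without rotation the energy is $M_0 + K_L$, and with rotation the rotational kinetic energy enters with the opposite sign.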


REFERENCES TO LITERATURE 

Clark, G. L., 1941. Proc. Roy. Soc., A, clxxvii, 227-250.

Clark, G. L., 1946. Phil. Mag., Series 7, XXXIX, 747-778.

Eddington, A. S., 1943. The Combination of Relativity Theory and Quantum Theory, Dublin.
Eddington, A. S., and Clark, G. L., 1938. Proc. Roy. Soc., A, clxvi, 465-475.
Ruse, H. S., 1935. Proc. Edin. Math. Soc., (2), iv, 144-158.

Sitter, W. de, 1916. Mon. Not. R. Astr. Soc., lxxvii, 155-184.
Whittaker, E. T., 1935. Proc. Roy. Soc., A, cxlix, 384-395.


(Issued separately May 20, 1949)



( 424 ) 


XLIV.— The Equivalence of the Gravitational and Invariant Mass of an Isolated 
Body at Rest.* By G. L. Clark, Trinity College, Cambridge. Communicated 
by Sir Edmund Whittaker, F.R.S. 

(MS. received May 22, 1946) 

Throughout the preceding discussion we have considered only those contributions to the
stresses which are due to either the motion of the body or the presence of other bodies. That
is, we have not considered the stresses which occur in the systems discussed by Whittaker, and
we have, in effect, assumed that the gravitational mass of an isolated body at rest is the same
as its invariant mass at least as far as terms of order $m^2$ are concerned. In this appendix we complete
the investigation by demonstrating the validity of this assumption.

Retaining terms up to order $m^2$ only, Whittaker’s analysis gives

$$M = \int T\sqrt{-g}\,dV - 2\int T_l^l\,dV, \tag{1}$$

since $T_l^l$ is of order $m^2$.

Now the invariant mass $M_0$ is given by

$$M_0 = \int T\sqrt{-g}\,\frac{dt}{ds}\,dV = \int T\sqrt{-g}\,dV - \tfrac12\int Th_{44}\,dV. \tag{2}$$

That is,

$$\int T\sqrt{-g}\,dV = M_0 - \int\phi\,dm, \tag{3}$$

where $\phi = -\tfrac12 h_{44}$ is the Newtonian potential and $dm$ is an element of mass of the body. The
gravitational energy $\Omega_s$ of a body is given by (see, for example, Ramsey, 1940, p. 57)

$$\Omega_s = -\tfrac12\int\phi\,dm. \tag{4}$$

The equation (3) may therefore be written

$$\int T\sqrt{-g}\,dV = M_0 + 2\Omega_s. \tag{5}$$

We shall also prove that

$$\int T_l^l\,dV = -\int T_{ll}\,dV = \Omega_s. \tag{6}$$

Substituting (5) and (6) in (1) then gives

$$M = M_0. \tag{7}$$

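From equations (12.8), (12.9) and (12.10) of Clark (1941, p. 249) we have

$$16\pi T_{ll} = \gamma_{ll|ss} - \tfrac78\gamma_{44|s}\gamma_{44|s} - \gamma_{44}\gamma_{44|ss} \tag{8}$$

$$\phantom{16\pi T_{ll}} = (\gamma_{ll} - \tfrac7{16}\gamma_{44}^2)_{|ss} - \tfrac18\gamma_{44}\gamma_{44|ss}, \tag{9}$$

or

$$16\pi T_{ll} = (\gamma_{ll} - \tfrac7{16}\gamma_{44}^2)_{|ss} + 8\pi\phi\rho, \tag{10}$$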

* This is to be regarded as an appendix to the paper, “On the Gravitational Mass of a System of
Particles”, p. 412 supra.




since

$$\phi = -\tfrac12 h_{44} = -\tfrac14\gamma_{44} \tag{11}$$

and

$$16\pi\rho = \gamma_{44|ss} \tag{12}$$

correct to order $m$.



From (10), we have

$$\int T_{ll}\,dV = \frac{1}{16\pi}\int(\gamma_{ll} - \tfrac7{16}\gamma_{44}^2)_{|ss}\,dV - \Omega_s, \tag{13}$$

on using (4).

The first volume integral on the right-hand side of (13) may be expressed as a surface
integral. Also since the energy-tensor is zero in empty space and the $\gamma_{\mu\nu}$ and $\gamma_{\mu\nu|s}$ are
continuous at the boundary of the body, we may integrate the volume integrals throughout
all space and the surface integral over an infinite sphere. Now since $\gamma_{44}^2$ and $\gamma_{ll}$ are both of
order $-2$ in $r$, the surface integral is of the order $-1$ in $r$ and accordingly vanishes. In any
case, at sufficiently great distances, the field will be the same as that due to an equivalent
spherical particle, and the actual expressions for $\gamma_{44}^2$ and $\gamma_{mn}$ will be

$$\gamma_{44}^2 = \frac{16m^2}{r^2}, \tag{14}$$

$$\gamma_{mn} = \frac{7m^2x_mx_n}{r^4}. \tag{15}$$



The result (15) is taken from the expression (unnumbered) for $\gamma_{mn}$ on p. 242 of Clark (1941),
and from equation (13.3), p. 97, of the Einstein, Infeld and Hoffmann paper (1938).

For a sphere we have therefore

$$\gamma_{ll} - \tfrac7{16}\gamma_{44}^2 = 0 \tag{16}$$

for all values of $r$. Consequently

$$(\gamma_{ll} - \tfrac7{16}\gamma_{44}^2)_{|s} = 0, \tag{17}$$

and the surface integral vanishes when evaluated over a sphere of finite radius. On using
(17), the equation (13) takes the form

$$\int T_{ll}\,dV = -\Omega_s,$$

which is equation (6).

It is interesting to verify the result (6) in the particular case of a sphere of constant density
and radius $a$. The Newtonian potential is given by

$$\phi = \frac{m}{2a^3}(3a^2 - r^2), \tag{18}$$

and consequently 



The line-element for this system has been given by Schwarzschild (see, for example, Eddington,
1924, para. 72). This solution gives isotropic hydrostatic pressure at every point. To the
required order of approximation, the pressure is




The expression (20) is obtained from Eddington’s equation (72.4) on writing a * am/a*. Now 
from (20) we have 

-[mv-for-vt ( 21 ) 

Combining (19) and (21) we then have 

- (22) 

in agreement with (6). 
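The gravitational energy of the uniform sphere used in this verification can also be obtained by direct quadrature of $\Omega_s = -\tfrac12\int\phi\,dm$. The sketch below is an illustration only: it assumes the internal potential in the positive form $\phi = m(3a^2 - r^2)/(2a^3)$, which is what the definition $\phi = -\tfrac12 h_{44}$ implies, compares the result with the standard Newtonian value $-3m^2/(5a)$, and uses a hypothetical function name and sample values.

```python
import math

def sphere_gravitational_energy(mass, radius, steps=2000):
    """Midpoint-rule estimate of -(1/2) * integral of phi dm over a uniform sphere."""
    density = 3.0 * mass / (4.0 * math.pi * radius**3)
    dr = radius / steps
    total = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        phi = mass * (3.0 * radius**2 - r**2) / (2.0 * radius**3)  # internal potential, assumed positive
        dm = density * 4.0 * math.pi * r**2 * dr                   # mass of a thin spherical shell
        total += phi * dm
    return -0.5 * total

mass, radius = 0.02, 5.0
numerical_energy = sphere_gravitational_energy(mass, radius)
closed_form_energy = -3.0 * mass**2 / (5.0 * radius)
print(numerical_energy, closed_form_energy)
```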

ADDITIONAL REFERENCES 

Eddington, A. S., 1924. The Mathematical Theory of Relativity, Cambridge.

Ramsey, A. S., 1940. An Introduction to the Theory of Newtonian Attraction, Camb. Univ. Press.


(Issued separately May 20, 1949)



( 427 ) 


XLV.— The Internal and External Fields of a Particle in a Gravitational Field. 
By G. L. Clark, Trinity College, Cambridge. Communicated by Sir Edmund 
Whittaker, F.R.S. 

(MS. received April 30, 1946. Revised MS. received April 30, 1946) 

Summary 

The gravitational field of a system of particles was investigated by de Sitter as far back as
1916. A minor alteration to the analysis was made by Eddington and Clark in 1938. The
amended value of the potential $g_{44}$ is the same as that derived by Einstein, Infeld and Hoffmann
without making use of the energy-tensor; this agreement suggests that the revised de Sitter
argument is correct. In this paper we show that this is not the case, for the de Sitter analysis
completely overlooked any possible interaction terms in the stress components of the
energy-tensor. We find the value of these terms, $p_{mn}$, and show that the agreement mentioned above
is due to the fact that the volume integral of $p_{ll}$ vanishes.


1. Introduction 


In this paper we adopt the notation and conventions used in the previous paper. We also
restrict ourselves to fields satisfying the co-ordinate conditions

$$\gamma_{\mu s|s} - \gamma_{\mu 4|4} = 0. \tag{1.1}$$

For gravitational systems the terms of order m are of the form 

744 = 2 ^, (l.2) 

where -(f> is the Newtonian potential and 

y mn =o. (1.3) 

A tedious calculation gives the values of the components of the energy-tensor correct 

to m\ 

1 =y44|ss+6#| ss +i$| a ^| i0 (m) 

and 

T-^7rT mn ^= : y mn \ S8 + 2<jxf>\ mn — 2^g|<£{ s 8 mn + <f>\m$\n ~ ( I *5) 


The expression (1.5) is essentially (12.8) of Clark (1941). From (1.5) we have 

i6ir7 , !l =y»|s s -4#| ss -#| s ^! s - (i-6) 

<f>=U+hy'u, (i-7) 

where - %y u is the Newtonian potential due to the particle, which we shall assume is spherical, 
and U is the potential due to the other bodies. We take 


We now put 


744= — 


4 m 


(1.8) 


and 


744 = Ti"( - 3 fl2 + *■*)» 0-9) 

CL 

where m is a constant, for the external and internal fields respectively. Moreover, in the 




neighbourhood of the particle, we may write 

U—U+x s t }|, + . . (1.10) 

where $\bar U$, $\bar U_{|s}$, ... are the values of $U$, $U_{|s}$, ... at the centre of mass of the particle which,
for convenience, we have taken to be at the origin. In the Einstein paper and in a similar
discussion given by Clark (1941, paras. 8-10) only those terms involving $U_{|s}$, which were
necessary to obtain the equations of motion, were considered. The present analysis requires
us to consider all terms containing $\bar U_{|s}$ but not $U_{|mn}$, $U_{|mnp}$, ... We may note also that
another minor difference between this and the previous investigations is due to the use of the


co-ordinate condition 

$\gamma_{ms|s} - \gamma_{4m|4} = 0,$ (1.11)

instead of

$\gamma_{ms|s} = 0.$ (1.12)

By making use of (1.7) and (1.10), the expressions (1.4), (1.5) and (1.6) may be written, in
the neighbourhood of the particle,

1 671^44= y u \ ss + 3 tJy u | w + 3 V\^y u] „ +f U\ s y f u j „ (1.13) 

J 6 7 r T mn *=y mn \ ss + Uy ulmn — ’Cy±±\ ss ^> mn + ^\s xS yu\mn ~ 

+ + \fi\nYu\m ~ 14) 

and 

i6-nTii=y lAss - 2 Vy' u]l!S - 2V ls x 3 y' iMrr - 0 \ s y' ii]s . (1.15) 


2. The External Field in the Neighbourhood of the Particle
Substituting (1.8) in (1.13)-(1.15) gives

_ j yyiX? 

j67tT ’14 = 744183 + 6U\ 

1 & 7 rT mn =y m „|ss + 4 ~ ~£mn - 


and 


i6ttT u = y lllss - 14 tl\~- 




„ mx _ 

+ 2 U\n~Z- + 2 1 U\ 


mx 11 


J \n r Z r z ) 


The condition for free space requires us to take solutions 

a ^ m a-Fr mx * -n mx ? 

v*=~+PV\'-ir-3V u —, 

« rt f 1 * x m x n \ _ mx s x m x n mx 8 


where $\alpha$ and $\beta$ are constants.
From (2.5) we have 


and 


2m r 

ymsts- r U\ m , 


X? _ X s 

Yu -—- 7 m%- - ma 2 V u —. 


(2.t) 

(2.2) 

(2-3) 

(2.4) 


ma*( x? _ x n X 7 * \ 

+ ~{~^P\^mn+^tJ lm +-^ 17 \ n ), (2.5) 


( 2 . 6 ) 


r 


(2.7) 



The Internal and External Fields of a Particle in a Gravitational Field 429 

The equations (1.4) and (1.6) can be integrated in forms which are valid for all space. The 
solutions, apart from additive harmonic functions, are 

744 = ( 2 - 8 ) 

and 

Vn=*l<f> 2 - ( 2 - 9 ) 

The condition that the expressions which are valid for all space should reduce to the values 
(2.4) and (2.7) in the neighbourhood of the particle fixes the value of the additive harmonic 
functions which appear in (2.8) and (2.9). The result is 

TJ m jjjS 

744 = -1 ft + (« - 3 )— + ^1^ (2-8®) 

and 

, Um n - x 5 

yn =t — U\s^- (2.9a) 

The values of $\alpha$ and $\beta$ in (2.4) are fixed by the condition

$\gamma_{4s|s} - \gamma_{44|4} = 0.$ (2.10)

It is found [Clark, 1941, equations (9.4) and (9.9)] that $\alpha = 4$ and $\beta = 0$. We have therefore,
in free space,

^44 = 1744+l7ZI 

Um -a? 

-ft-—~ ml! \^z ( 2 - JI ) 


Putting m = nit, r = r { and £ 7 = -2^-—, the second term on the right-hand side of (2.11) 
may be written ij 

2 m j 
r i s Au 

in agreement with previous calculations. The third term, which is of the order — 2 in r, is 
new; its appearance is due to the fact that we have considered terms of order - 3 in r in y ws !s- 
Terms of this order were not included in the 1941 and earlier discussions. 

3. The Internal Field 

We now have to find potentials which are finite at the origin and which, with their derivatives, 
are continuous at the boundary $r = a$. We impose the further restriction that the co-ordinate
conditions (1.1) are satisfied. After some calculation we find that the solutions are 

744=“F <30 2 “ **) + - ir\ (3-i> 

Vmn—~r( - 5« a + 3^)x m x n + ^ + ioaV 2 - -IjB mn 

- + ^^^(_ 5a2 + 3 ^ + ^ ( _ ¥ a 4 + ^_ 3 ^ ran 


+ + U {n x m )(W - f£aV 2 +1?- 4 ). 


From (3.2) we have 


mt 7 

y?nsl 5 “ -3 1»( ~ 3 ° + r )* 


6mt 7 . 

Yu\*s- ~ _3 “ ,,3 u \s??> 





and 

7 ««lss = ^(S°a 2 S mn - &4r 2 8 mn + 42x m x n ) 

d 



7 YI 

+-^V\ s x s {(6Sa 2 - Sir 2 )S mn + 2jx m x n } 

7 ft ^ ^ 

+ —s(x m U\ n +x n V\ m )(-2.ia i + 2V 2 ). 

( 3 - 5 ) 

The contributions from the quadratic terms in (1.13) and (1.14) are respectively 


and 

36mU 42m- 

( 3 - 6 ) 


-^C 8 mn - +^(V 1n x m + 

( 3 - 7 ) 

From (3-4)—(3.7) we obtain 


and 

Tim 

Am ~ SiTa!^ 210 * ~ 42 ^ S ™« + 

( 3 - 8 ) 


+ i67r^ l ^ 2ai ~ 3 r 2 ) S m» +x m x n } 

+ 3 L^ xm °\n+x n V\ m )(r*-* 2 ')- 

( 3 - 9 ) 


4. Terms involving Velocity and Acceleration

We next turn our attention to the linear terms involving the velocity and acceleration. 
The external field is given by 

A m 2mv s v s 

744 = -— - 2mf - ■ 

r r 

4 m 2 m X s v s Amtfv 8 2 mx s x r v s v r 
= - T +—-+-s-, U.i) 


Ymn -‘ 


4 mv m v n 


where a dot denotes differentiation with respect to the time. The potentials (4.1)-(4.3) satisfy


-74414 = 0 


Yimli Ypisls — r - (4.5) 

ST" ^ " theprCTi0 “- ■“ 

[»-o. 


(4.6) 



The corresponding internal field, satisfying the boundary conditions at r=a, is 
2m, mv s x s / 2mv 2 mx s x r 

Yu = ~ ~ r ) + —“a—( 3 ^ a ~^ ) + —s~(“ 3 « + **) + —T~(sa 2 -&*)•&&, (4.7) 


Yin=^(3* 2 -r*)v n , 


2m, n 

Ymn = - ~^(3 a - r 2 )v m v n . 

The expressions (4.7)-(4.9) satisfy (4.4) and


2m, 


Yim 14 “ Yms | S = ~ r*)V m . 

Hence, by (3.3), we have on including the interaction terms,

2 m _ 

y^m\i ~~ Ym$\$ “ _3 (3 122 “ r 2 )(v m + \U j TO ). 


The expression on the right-hand side of (4.11) vanishes if (4.6) is satisfied. 
To the order considered, the energy-tensor is given by 

iGTrTp'v — y^ss ~~ Yfiv |44* 


(4.8) 

(4*9) 

(4.10) 

(4.11) 

(4.12) 


The external field given by (4.1)-(4.3) gives a zero energy-tensor. The internal solution
gives 

3 1 r 2 v 2 jx s x r v s v r \ 

- 7*-) 


T u = 1 - -d*#* + -v' 

47 ra 6 \ 2 2 2 & 


.2 




3 ml 1 3*. 1 7 

-±—( 1+ -U ]s x s + -v 2 - r ---— , 

47t<2 3 \ 4^2 2 a 2 2 a 2 ) 

3 m 


47 ra° 


-zf 1 


„ 3m 

r mn —— iv m v n . 

mn __s ? 

4770 ° 


(4-13) 

(4-i3«) 

(4.14) 

(4.1s) 


where $v^2 = v_s v_s$ and (4.13a) is derived from (4.13) by means of (4.6).


5. The Complete Energy-Tensor 

The complete energy-tensor is obtained by adding (3.8) and (3.9) to (4.13a) and (4.15)
respectively. The result is 


where 


^ , . 3 , 7x*xWv r . 5rt. 5 .rt \ 

Tu=p lx+-/-^—+-J 7 +-^ ls ), 

= pV m V n 


P = 


pn 

477a 3 


Cs-i) 

( 5 - 2 ) 

(5.3) 

(5-4) 


and $p_{mn}$ is given by the expression on the right-hand side of (3.9). Now, at the boundary,

(5-5) 


Ann ~ g^s ( “ + X m X n ) + ~^- 5 ( - a S S mn + X m X n )x?T} |„ 




so that 






2iVmx m 
8 ?ra z 
o. 


(-i + i) + 


27 mx m 
16ira z 


(-1 + x)x 3 U\g 


( 5 - 6 ) 


Since the direction cosines of the normal to the surface of the body are $x_s/a$, the equation
(5.6) expresses the condition that the normal components of the stresses vanish at the boundary.
Again, we have 


Pn 9 * 


21 Urn 
87 7a 5 


( 3 <z 2 - 5 ^ 2 ) + 


2>jmtj\pc 8 

1677a 5 


(S^ 2 “ 7 ^ 2 ). 


( 5 - 7 ) 


Integrating through the body we obtain the result

$\int p_{ll}\,dV = 0.$ (5.8)
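The vanishing of this volume integral can be checked by hand. The sketch below assumes, as the printed form of (5.7) appears to read, that the direction-independent part of the integrand is proportional to $3a^2 - 5r^2$; the part proportional to $x_s$ integrates to zero over a sphere by symmetry and is not computed. The function names are illustrative only.

```python
from fractions import Fraction


def integrate_polynomial(coeffs, upper):
    """Integrate sum(c * r**n) over 0 <= r <= upper; coeffs maps n -> c."""
    upper = Fraction(upper)
    return sum(Fraction(c) * upper ** (n + 1) / (n + 1) for n, c in coeffs.items())


def isotropic_stress_integral(a):
    """Radial integral of (3a^2 - 5r^2) * r^2 from 0 to a.

    The extra factor r^2 comes from the volume element 4*pi*r^2 dr;
    the constant 4*pi is dropped because it does not affect the zero.
    """
    a = Fraction(a)
    return integrate_polynomial({2: 3 * a**2, 4: -5}, a)


if __name__ == "__main__":
    for radius in (1, 3, Fraction(7, 2)):
        print(radius, isotropic_stress_integral(radius))
```

The two contributions are $a^5$ and $-a^5$, so the integral is zero for every radius, which is what (5.8) asserts.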

In the 1938 investigation the equation (3.3), from which the second-order terms in k u 
were derived, was assumed (using the present notation) to be of the form 

k u\ss =M/>o + 2 Po v 2 ) + iy' ul 4 i + \y u[s y' ul „ (5.9) 

where $\rho_0$ is the invariant density. The present investigation shows that the stress terms $p_{mn}$
should have been considered, and (5.9) should, in fact, have been written

^44 [ss = MpO + 2p0» 2 + 2 $11) + iy'u\4A + ly'u\ S Yu\y ( 5 - 1 °) 


The correctness of the amended de Sitter result in the 1938 paper is entirely due to the 
equation (5.8). 

The proper density is given by 



1 +- v 2 -—2 
2 2 a? 


*]X 8 X r V 8 V r \ 

~ a* ) 



1 

2 


r 2 v 2 
2 a 2 


JX s X r V 8 V r \ 

2 a 2 ) 


+ #/)(?? + x s Ui B )-pu 




( 5 -i 1) 


on using (1.10) and neglecting $U_{|sr}$ and higher derivatives. Also, to the order considered,


dt 




Accordingly, we have 




r 2 v 2 jx s x r v s v r 
2a 2 2 a 2 


“ Pu • 


(5-12) 

( 5 -i 3 ) 


The invariance of the quantity 


If \ T J s V(~g)dxdydz 


is known from general theory, but it is instructive to deduce this result directly from (5.13); 
we have 




jt 


T—^/(-g)dccdydx 


m{i+v\x-&-■&)} 


on using (5.4) and (5.8). 


(S-i 4 ) 





REFERENCES TO LITERATURE 

Clark, G. L., 1941. Proc. Roy. Soc., A, clxxvii, 227-250.

Eddington, A. S., and Clark, G. L., 1938. Proc. Roy. Soc., A, clxvi, 465-475.

Einstein, A., Infeld, L., and Hoffmann, B., 1938. Ann. Math., Princeton, etc., xxxix, 65-100.

Sitter, W. de, 1916. Mon. Not. R. Astr. Soc., lxxvii, 155-184.


(Issued separately May 20, 1949)





XLVI. The Mechanics of Continuous Matter in the Relativity Theory. By
G. L. Clark, Trinity College, Cambridge. Communicated by Sir Edmund 
Whittaker, F.R.S. 

(MS. received September 5, 1947. Revised MS. received July 26, 1948. Read November 8, 1948) 


1. Introduction 

The extension of the classical theory of elasticity to relativity mechanics presents many 
problems of great complexity. The chief difficulty arises from the fact that whereas in the 
classical theory the stress-strain relations are valid only in the case of small strain, in a complete 
relativistic treatment the squares and cubes of the components of strain must necessarily be 
retained. Although the general formulation of the laws of elasticity is still unsolved, as far 
back as 1917 Lorentz (1) published a theory applicable to the case of small strain, and as an 
illustration considered the problem of an incompressible homogeneous disc rotating with a 
small angular velocity about its axis. He claimed that the radius measured by an observer 
at rest on the disc is reduced from its original value $a$ by an amount $\tfrac{1}{8}\omega^2 a^3$. This result
has never been universally accepted but, on the contrary, many irrelevant and unjustified
criticisms of Lorentz' argument have been expressed. Some of these have been collected
together and reproduced, with references, in a Memoir by Seyuan Shu (2). The author of
the present paper takes the view that Lorentz’ theory is in the main correct, but points out 
that both Lorentz and his critics have overlooked the postulate that the velocity of propagation 
of the dilatation cannot exceed the velocity of light. On working out the consequence of this 
postulate it is found that there is no contraction of the radius. For material in which the 
waves of dilatation travel with the fundamental velocity there is no alteration in the radius of 
the disc. 

In the second half of the paper (§ 5) the equation of equilibrium of a continuous static 
distribution of matter is discussed in the case in which terms involving the cube but not the 
fourth power of the density are retained. 


2. The Problem of a Rotating Disc or Cylinder: The Classical Theory 

In textbooks on elasticity the following equations and expressions are derived and explained 
in some detail for isotropic bodies. When the body forces may be neglected, the equations 
of equilibrium are, using rectangular cartesian co-ordinates,* 


$\dfrac{\partial p_{ij}}{\partial x_j} = \rho f_i,$ (2.1)

where $\rho$ is the density, $f_i$ are the components of the acceleration, and the components of stress
$p_{ij}$ are given by

$p_{ij} = \lambda\Delta\delta_{ij} + 2\mu e_{ij},$ (2.2)

where the strains $e_{ij}$ are related to the components $u_i$ of the displacement by the expressions

$e_{ij} = \tfrac{1}{2}\left(\dfrac{\partial u_i}{\partial x_j} + \dfrac{\partial u_j}{\partial x_i}\right), \quad \Delta = e_{ss},$ (2.3)

and $\lambda$, $\mu$ are constants for the body. Further, it is shown that when the material is homogeneous
the dilatation $\Delta$ is propagated according to the equation

$(\lambda + 2\mu)\nabla^2\Delta = \rho\,\dfrac{\partial^2\Delta}{\partial t^2}.$

* Throughout the paper italic indices run over the values 1, 2, 3 only.


The velocity ($c_0$) of propagation of $\Delta$ is accordingly

$c_0 = \sqrt{\dfrac{\lambda + 2\mu}{\rho}}.$ (2.4)


For incompressible material $\lambda \gg \mu$, and in this case (2.4) reduces to

$c_0 = \sqrt{\lambda/\rho}.$ (2.5)

The equations (2.1)-(2.3) can readily be expressed in alternative forms when other co-ordinate
systems are used. It is, in particular, convenient to use cylindrical polars ($r$, $\theta$, $z$) when the
displacement is symmetrical about the $x_3 = z$-axis. If, in addition, there is no displacement in
the direction of the $z$-axis, the equations (2.1)-(2.3) take the forms


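$(\lambda + 2\mu)\dfrac{d}{dr}\left(\dfrac{dU}{dr} + \dfrac{U}{r}\right) = \rho f_r,$ (2.6)

where $f_r$ is the radial acceleration, and the only non-zero components of strain are

$e_{rr} = \dfrac{dU}{dr}, \quad e_{\theta\theta} = \dfrac{U}{r},$ (2.7)

where $U$ is a function of $r$ only. The only non-vanishing components of stress are given by

$p_{rr} = \lambda\left(\dfrac{dU}{dr} + \dfrac{U}{r}\right) + 2\mu\dfrac{dU}{dr}, \quad p_{\theta\theta} = \lambda\left(\dfrac{dU}{dr} + \dfrac{U}{r}\right) + 2\mu\dfrac{U}{r}, \quad p_{zz} = \lambda\left(\dfrac{dU}{dr} + \dfrac{U}{r}\right).$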


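When the system under consideration consists of a homogeneous cylinder of radius $a$ rotating
with angular velocity $\omega$ about its axis, the solution is found to be [e.g. Love (3)]

$U = \dfrac{\rho\omega^2 r}{8(\lambda + 2\mu)}\left(\dfrac{2\lambda + 3\mu}{\lambda + \mu}\,a^2 - r^2\right).$

For an incompressible body we take $\lambda \gg \mu$, and so the displacement and stresses are given
respectively by

$U = \dfrac{\rho\omega^2 r}{8\lambda}(2a^2 - r^2) = \dfrac{\omega^2 r}{8c_0^2}(2a^2 - r^2)$ (2.8)

and

$p_{rr} = p_{\theta\theta} = p_{zz} = \tfrac{1}{2}\rho\omega^2(a^2 - r^2)$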


on using (2.5). The solution is applicable to the greater part of the cylinder but is defective 
near its ends. 
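The classical solution quoted from Love can be verified with exact rational arithmetic. The sketch below assumes that the radial acceleration is the centripetal value $f_r = -\omega^2 r$ and that the curved surface $r = a$ is free of radial stress; neither assumption is spelt out in the text above, and the function names are illustrative only.

```python
from fractions import Fraction as F


def displacement(r, a, lam, mu, rho, omega2):
    """Radial displacement U(r) of the rotating cylinder (omega2 = omega squared)."""
    k = rho * omega2 / (8 * (lam + 2 * mu))
    return k * r * ((2 * lam + 3 * mu) / (lam + mu) * a**2 - r**2)


def displacement_gradient(r, a, lam, mu, rho, omega2):
    """dU/dr, differentiated by hand from the expression for U."""
    k = rho * omega2 / (8 * (lam + 2 * mu))
    return k * ((2 * lam + 3 * mu) / (lam + mu) * a**2 - 3 * r**2)


def dilatation(r, *p):
    """Delta = dU/dr + U/r for a purely radial displacement."""
    return displacement_gradient(r, *p) + displacement(r, *p) / r


def radial_stress(r, a, lam, mu, rho, omega2):
    """p_rr = lam * Delta + 2 * mu * dU/dr."""
    p = (a, lam, mu, rho, omega2)
    return lam * dilatation(r, *p) + 2 * mu * displacement_gradient(r, *p)


def equilibrium_residual(r, a, lam, mu, rho, omega2, h=F(1, 8)):
    """(lam + 2 mu) dDelta/dr - rho * f_r with f_r = -omega^2 r (assumed).

    Delta is quadratic in r, so the central difference below is exact.
    """
    p = (a, lam, mu, rho, omega2)
    slope = (dilatation(r + h, *p) - dilatation(r - h, *p)) / (2 * h)
    return (lam + 2 * mu) * slope + rho * omega2 * r


if __name__ == "__main__":
    params = (F(2), F(5), F(3), F(7), F(1, 4))  # a, lam, mu, rho, omega^2
    print(radial_stress(F(2), *params), equilibrium_residual(F(1), *params))
```

Both printed quantities are exactly zero for the sample values, so the quoted $U$ satisfies the radial equation and leaves the boundary stress-free. Letting `lam` grow with `mu` fixed drives `radial_stress` towards $\tfrac{1}{2}\rho\omega^2(a^2 - r^2)$.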

3. The Relativity Modification of Lorentz 


We now give a brief account of the argument given by Lorentz. Denoting the velocity 
of light by $c$ and the time co-ordinate $x_4$ by $t$, the metric for an observer $S$ on the cylinder
when it is stationary is

$ds^2 = -dx_1^2 - dx_2^2 - dx_3^2 + c^2dt^2.$ (3.1)

Applying the transformation

$x_1 = x_1'\cos\omega t' - x_2'\sin\omega t', \quad x_3 = x_3',$
$x_2 = x_1'\sin\omega t' + x_2'\cos\omega t', \quad t = t',$ (3.2)

the metric for an observer $S'$ at rest on the rotating system is

$ds^2 = -dx_1'^2 - dx_2'^2 - dx_3'^2 + 2\omega x_2'\,dx_1'\,dt' - 2\omega x_1'\,dx_2'\,dt' + \{c^2 - \omega^2(x_1'^2 + x_2'^2)\}\,dt'^2.$ (3.3)




Now for the metric

$ds^2 = g_{\mu\nu}\,dx_\mu\,dx_\nu \quad (\mu, \nu = 1, 2, 3, 4)$

the invariant spatial interval $dl$ is given by

$-dl^2 = \left(g_{mn} - \dfrac{g_{m4}g_{n4}}{g_{44}}\right)dx_m\,dx_n \quad (m, n = 1, 2, 3).$ (3.4)

Hence the spatial intervals for $S$ and $S'$ are respectively

$dl^2 = dx_1^2 + dx_2^2 + dx_3^2,$
$dl'^2 = \left\{1 + \dfrac{\omega^2 x_2'^2}{c^2 - r^2\omega^2}\right\}dx_1'^2 - \dfrac{2\omega^2 x_1' x_2'}{c^2 - r^2\omega^2}\,dx_1'\,dx_2' + \left\{1 + \dfrac{\omega^2 x_1'^2}{c^2 - r^2\omega^2}\right\}dx_2'^2 + dx_3'^2,$ (3.5)

where

$r^2 = x_1^2 + x_2^2 = x_1'^2 + x_2'^2.$

In cylindrical polar co-ordinates (3.5) take the forms

$dl^2 = dr^2 + r^2d\theta^2 + dz^2,$
$dl'^2 = dr^2 + \dfrac{r^2d\theta^2}{1 - r^2\omega^2/c^2} + dz^2.$ (3.6)
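The passage from the rotating-frame metric to the spatial interval can be checked numerically. The sketch below builds the metric coefficients read off from (3.3), applies the rule $g_{mn} - g_{m4}g_{n4}/g_{44}$ of (3.4), and measures unit steps in the radial and tangential directions. The function names and sample values are illustrative assumptions.

```python
import math


def spatial_metric(x1, x2, omega, c):
    """3x3 coefficients of dl'^2 for the rotating observer, from (3.3) via (3.4)."""
    # Covariant components g_{mu nu}; indices 0..2 are spatial, 3 is time.
    g = [[0.0] * 4 for _ in range(4)]
    g[0][0] = g[1][1] = g[2][2] = -1.0
    g[0][3] = g[3][0] = omega * x2          # from the term 2*omega*x2' dx1' dt'
    g[1][3] = g[3][1] = -omega * x1         # from the term -2*omega*x1' dx2' dt'
    g[3][3] = c**2 - omega**2 * (x1**2 + x2**2)
    return [[-(g[m][n] - g[m][3] * g[n][3] / g[3][3]) for n in range(3)]
            for m in range(3)]


def squared_length(h, d):
    """dl'^2 for a small displacement d = (dx1', dx2', dx3')."""
    return sum(h[m][n] * d[m] * d[n] for m in range(3) for n in range(3))


if __name__ == "__main__":
    x1, x2, omega, c = 0.3, 0.4, 0.6, 1.0
    r = math.hypot(x1, x2)
    h = spatial_metric(x1, x2, omega, c)
    radial = (x1 / r, x2 / r, 0.0)
    tangential = (-x2 / r, x1 / r, 0.0)
    print(squared_length(h, radial))       # expected to be 1
    print(squared_length(h, tangential))   # expected to be 1/(1 - r^2 omega^2/c^2)
```

A radial step keeps unit length while a tangential step is stretched by $1/(1 - r^2\omega^2/c^2)$ in $dl'^2$, which is the content of (3.6).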


We accordingly deduce that the length of a standard measuring rod pointing in the radial
direction is unaltered by the rotation, but a factor $1/\sqrt{1 - r^2\omega^2/c^2}$ has to be applied when the
rod is at right angles to the radius. When $r\omega/c$ is small, this results in $S'$ observing a strain
$e'_{\theta\theta} = r^2\omega^2/2c^2$ which has to be inserted in (2.7). Taking $\lambda$ to be infinite, the dilatation $\Delta$ is
unaltered if we have an additional displacement $U'$, corresponding to $e'_{\theta\theta}$, where

$\dfrac{dU'}{dr} + \dfrac{U'}{r} + \dfrac{r^2\omega^2}{2c^2} = 0.$

The solution of this equation is

$U' = -r^3\omega^2/8c^2.$ (3.7)

Adding (2.8) and (3.7), we find that the displacement has the value

$U = \dfrac{\omega^2 r}{8}\left\{\dfrac{1}{c_0^2}(2a^2 - r^2) - \dfrac{r^2}{c^2}\right\}.$ (3.8)

The density in the strained state can be calculated, since the number of particles in a ring of
radius $r$ and width $dr$ in the unstrained state is the same as the number in a ring of radius
$r + U$ and width $dr(1 + dU/dr)$ in the strained state. Remembering to insert the Lorentz
factor $1/\sqrt{1 - r^2\omega^2/c^2}$, the density $\rho'$ of the rotating cylinder is found to be

$\rho' = \rho\left\{1 - \dfrac{\omega^2}{2c_0^2}(a^2 - r^2)\right\}.$ (3.9)


The change in the radius of the cylinder is determined by putting $r = a$ in (3.8); this gives

$U = \dfrac{\omega^2 a^3}{8}\left(\dfrac{1}{c_0^2} - \dfrac{1}{c^2}\right).$ (3.10)

By taking $c_0$ to be infinite, Lorentz obtains the result

$U = -\dfrac{\omega^2 a^3}{8c^2}.$ (3.11)


The contraction (3.11) was accepted by Eddington (4), who gave an alternative method of
attacking the problem. According to him, the particle density (referred to proper measure) is 
unaltered by rotation in the case of an incompressible disc or cylinder. That is, Eddington 



asserts that $\rho' = \rho$; and consequently he is, in effect, assuming like Lorentz that $c_0$ is infinite
[see equ. (3.9)].

Now, from (3.6), the proper element of volume for the rotating system is

$dV = (1 - r^2\omega^2/c^2)^{-1/2}\,r\,dr\,d\theta\,dz,$ (3.12)

and the total number of particles in a disc of thickness $h$ is accordingly

$\int \rho'\,dV = 2\pi h\int_0^{a'} \rho'(1 - r^2\omega^2/c^2)^{-1/2}\,r\,dr,$

where $a'$ is the radius of the rotating disc and $\rho'$ is given by (3.9). Since this number must be
unaltered by the rotation, $a'$ must be a function of $\omega$ such that

$\int_0^{a'} \rho'(1 - r^2\omega^2/c^2)^{-1/2}\,r\,dr = \text{constant}.$

Expanding the square root and neglecting $a^4\omega^4/c^4$, we find

$a' = a\left\{1 + \dfrac{\omega^2 a^2}{8}\left(\dfrac{1}{c_0^2} - \dfrac{1}{c^2}\right)\right\},$

in agreement with (3.10).

By assuming that $c_0$ is infinite, Eddington naturally obtains the Lorentz result

$a' = a\left(1 - \dfrac{\omega^2 a^2}{8c^2}\right).$


It has already been pointed out, however, that the greatest possible value of $c_0$ is $c$. In this
case (3.10) gives $U = 0$ at the boundary and there is no alteration in the radius. If $c_0 < c$, there
is necessarily an expansion.
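The three cases just distinguished can be tabulated directly. The sketch below assumes that the boundary displacement is the sum of the classical term $\omega^2a^3/8c_0^2$ from (2.8) and the relativity term $-\omega^2a^3/8c^2$ from (3.7), both evaluated at $r = a$; the function name and sample values are illustrative only.

```python
import math


def boundary_displacement(omega, a, c0, c):
    """Change in radius at r = a: classical expansion plus relativity contraction."""
    return omega**2 * a**3 / 8 * (1 / c0**2 - 1 / c**2)


if __name__ == "__main__":
    omega, a, c = 0.1, 1.0, 1.0
    for label, c0 in (("c0 infinite (Lorentz)", math.inf),
                      ("c0 = c", c),
                      ("c0 = c/2", c / 2)):
        print(label, boundary_displacement(omega, a, c0, c))
```

With these sample values the first case is negative (a contraction), the second is exactly zero, and the third is positive (an expansion), matching the discussion above.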


4. Further Remarks on the Lorentz-Eddington Theory 

In the previous section we have made no attempt to deal with the criticisms which have 
been brought against the arguments of Lorentz and Eddington. As we mentioned in the 
Introduction, many of these criticisms are irrelevant and need not be discussed. In this 
section, therefore, we shall only comment briefly on the two most interesting points raised 
by the critics. 

In his review of the problem Seyuan Shu (ref. (2), p. 68), without giving any reason,
rejects the use of the transformation (3.2). In my view, the justification for this transformation
is that it does transform the mass motion part of the energy-tensor in the required manner.
Retaining only the first power of the density, the non-vanishing contravariant components of
the energy-tensor representing a system under isotropic stress rotating with constant angular
velocity $\omega$ about the $x_3$-axis are

T n =T 2^p pmaj £l_ Pi T i2 = _^ pm(0 & } 

r^-^pooocA 
T u =fPp m -^, = 

where $1/\beta^2 = 1 - r^2\omega^2/c^2$, $r^2 = x_1^2 + x_2^2$, $\rho_m$ and $p$ are invariants, and the leading terms in the
metric are given by (3.1).

On applying the transformation (3.2), the new values of the contravariant components are 

T rmn = g'mn^ T «» = g *«^ £2^ +g 'U p . 

That is, the energy-tensor is calculated with respect to an observer at rest on the rotating 
system. 



A further reason for retaining the transformation (3.2) is that it explains Michelson's
1925 experiment (5). In cylindrical polar co-ordinates (3.3) takes the form

$ds^2 = -dr^2 - r^2d\theta^2 - dz^2 - 2\omega r^2\,d\theta\,dt + (c^2 - r^2\omega^2)\,dt^2,$

and so the velocity of light in a direction perpendicular to the radius is

$V = c \mp r\omega = c \mp v,$ (4.1)


where v is the velocity of an observer at rest on the rotating system, and the negative sign is to 
be taken if the ray is travelling in the same sense as the rotation. In consequence of (4.1), 
if it were possible to transmit two rays of light in opposite directions round the earth, parallel 
to the equator, they would return to the starting-point at different times. In 1904 Michelson (6) 
showed that it is not necessary that the track should completely encircle the earth and calculated 
the difference in path between two light rays, one of which travels in a clockwise direction 
and the other in an anticlockwise direction. The experiment which he carried out twenty 
years later verified the validity of (4.1). 
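The difference in arrival times implied by (4.1) is easy to evaluate. The sketch below assumes a closed circular track of radius $r$ traversed once in each sense, with times reckoned in the co-ordinate time of the rotating observer; the function name and sample values are illustrative and are not taken from Michelson's papers.

```python
import math


def arrival_time_difference(r, omega, c):
    """Time lag between two rays sent opposite ways round a circle of radius r."""
    v = r * omega
    path = 2 * math.pi * r
    t_same_sense = path / (c - v)   # ray travelling with the rotation: speed c - v
    t_opposite = path / (c + v)     # ray travelling against the rotation: speed c + v
    return t_same_sense - t_opposite


if __name__ == "__main__":
    r, omega, c = 2.0, 0.1, 1.0
    exact = arrival_time_difference(r, omega, c)
    closed_form = 4 * math.pi * r**2 * omega / (c**2 - (r * omega)**2)
    print(exact, closed_form)
```

The two printed numbers agree. For $v \ll c$ the lag reduces to $4\pi r^2\omega/c^2$, which is proportional to the enclosed area and to the angular velocity.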

In his discussion on the problem, Eddington obtains the proper element, dV, of volume 
by a different method from the one used by Lorentz and reproduced in § 3. He shows that 
dV can be expressed in the form 

$dV = \sqrt{-g}\,\dfrac{dt}{ds}\,dx_1\,dx_2\,dx_3,$ (4.2)

and this reduces to

$dV = (1 - r^2\omega^2/c^2)^{-1/2}\,r\,dr\,d\theta\,dz$

for the metric (3.3). Although this expression is entirely in agreement with (3.6), Berenda (7)
has, for some inexplicable reason, criticised Eddington's treatment and has merely reproduced
the Lorentz formulae (3.4) and (3.6). To remove any misunderstanding on this point, we
observe that it is easy to see that (4.2) is only an alternative way of expressing (3.4). We
find, by using the elementary theory of determinants, that the determinant $g$ is equal to

$\begin{vmatrix} g_{11} & g_{12} & g_{13} & g_{14} \\ g_{12} & g_{22} & g_{23} & g_{24} \\ g_{13} & g_{23} & g_{33} & g_{34} \\ g_{14} & g_{24} & g_{34} & g_{44} \end{vmatrix} = g_{44}\,\|\kappa_{mn}\| = g_{44}\,\kappa, \quad \kappa_{mn} = g_{mn} - g_{4m}g_{4n}/g_{44}.$ (4.3)

On account of (4.3), (4.2) may be written

$dV = \sqrt{g_{44}}\,\sqrt{-\kappa}\,\dfrac{dt}{ds}\,dx_1\,dx_2\,dx_3.$ (4.4)

Now for an observer at rest

$ds^2 = g_{44}\,dt^2,$

and so (4.4) reduces to

$dV = \sqrt{-\kappa}\,dx_1\,dx_2\,dx_3;$

this is the volume element for the spatial interval (3.4).

This spatial interval can also be calculated by referring an element of volume at a given
point to “local” co-ordinates; that is, by considering the two-dimensional form of the Lorentz
transformation.

The three-dimensional form of the transformation is obtained by showing that $x_rx_r - c^2t^2$
is invariant under the transformation

$x_r' = \beta(a_{rs}x_s - v_rt),$
$t' = \beta(-v_rx_r/c^2 + t),$ (4.5)

where

$a_{rs} = a_{sr},$

provided that

$a_{is}a_{js} = \delta_{ij}/\beta^2 + v_iv_j/c^2, \quad v_sa_{is} = v_i,$ (4.6)

and the $v_i$ are treated as constants. These conditions are satisfied by taking

$a_{ij} = (1 - \mu v^2/c^2)\,\delta_{ij} + \mu v_iv_j/c^2, \quad \mu v^2/c^2 = 1 - 1/\beta.$

Now when $dt = 0$, (4.5) gives

$dx_r' = \beta a_{rs}\,dx_s.$

Consequently

$dx_1'^2 + dx_2'^2 + dx_3'^2 = \beta^2a_{mi}a_{ni}\,dx_m\,dx_n = (\delta_{mn} + \beta^2v_mv_n/c^2)\,dx_m\,dx_n$ (4.7)

on using (4.6).

For rotation about the $x_3$-axis, the components of velocity at the element considered are

$v_1 = -\omega x_2, \quad v_2 = \omega x_1, \quad v_3 = 0.$

With these values of $v_i$, (4.7) reduces to (3.4).
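The conditions (4.6) and the invariance of $x_rx_r - c^2t^2$ can be confirmed numerically for a sample velocity. The sketch below assumes $1/\beta^2 = 1 - v^2/c^2$ and takes the coefficients in the form $a_{ij} = (1 - \mu v^2/c^2)\delta_{ij} + \mu v_iv_j/c^2$ with $\mu v^2/c^2 = 1 - 1/\beta$; the function names are illustrative only.

```python
import math


def boost_coefficients(v, c=1.0):
    """Return beta and the symmetric coefficients a_ij for a velocity v (non-zero)."""
    v2 = sum(x * x for x in v)
    beta = 1.0 / math.sqrt(1.0 - v2 / c**2)
    mu = (1.0 - 1.0 / beta) * c**2 / v2          # from mu*v^2/c^2 = 1 - 1/beta
    a = [[(1.0 - mu * v2 / c**2) * (i == j) + mu * v[i] * v[j] / c**2
          for j in range(3)] for i in range(3)]
    return beta, a


def max_violation(v, c=1.0):
    """Largest absolute error in the two conditions (4.6)."""
    beta, a = boost_coefficients(v, c)
    worst = 0.0
    for i in range(3):
        worst = max(worst, abs(sum(v[s] * a[i][s] for s in range(3)) - v[i]))
        for j in range(3):
            lhs = sum(a[i][s] * a[j][s] for s in range(3))
            rhs = (i == j) / beta**2 + v[i] * v[j] / c**2
            worst = max(worst, abs(lhs - rhs))
    return worst


def transform(x, t, v, c=1.0):
    """Apply (4.5) to the event (x, t)."""
    beta, a = boost_coefficients(v, c)
    x_new = [beta * (sum(a[r][s] * x[s] for s in range(3)) - v[r] * t)
             for r in range(3)]
    t_new = beta * (t - sum(v[r] * x[r] for r in range(3)) / c**2)
    return x_new, t_new


if __name__ == "__main__":
    v = (0.3, -0.2, 0.1)
    x, t = (1.0, 2.0, 3.0), 0.5
    x_new, t_new = transform(x, t, v)
    print(max_violation(v))
    print(sum(q * q for q in x) - t**2, sum(q * q for q in x_new) - t_new**2)
```

The first printed number is zero to rounding error, and the two interval values agree, so this choice of $a_{ij}$ does satisfy (4.6) and leaves the interval unchanged.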


5. The Equation of Equilibrium of a Continuous Static 
Distribution of Matter 

In the classical theory of fluids and elastic bodies the equations of equilibrium are 


dx°~ pX ™ (5-l) 

where $p_{ms}$ are the components of stress, $\rho$ the density, and $X_m$ the gravitational attraction at
the element considered. We shall now, in this final section, proceed to verify that, in the
case of weak static fields for which the fourth power of the density may be neglected, the
equation (5.1) is taken over unaltered in relativity mechanics if $p_{ms}$ denotes covariant
components of the stress and

p-W-AVC-i). Cs-») 

X m =(44 (5-3) 

where $\{44, m\}$ is a Christoffel bracket and the components $g_{m4}$ of the potentials $g_{\mu\nu}$ are zero. The
expressions (5.2), (5.3) are those occurring in Whittaker's (8) extension of Gauss' Theorem
for a static system. This can be written in the form

pdx x dx 2 dx z ~ ~^dx 1 dx 2 dx 3i 
or 

47r|j'|(T 4 1 - T x l )^{-g)d Xl dx 2 dx 3 =^~(V( ~g)g ii {44, s})dx 1 dx 2 dx s . 

From (5.2) and (5.3) we have 

pX m = (r 4 4 -/,*)< -g){44, ^ 4 - ( 5 - 4 ) 

If we denote the Newtonian potential by $-\tfrac{1}{2}\phi$, the non-vanishing components of $g_{\mu\nu}$, $g^{\mu\nu}$ are

g mn =- 8 m *(i+$, g“=*~4>, 

and consequently 

( ~g)g M i 44 > m ) =K 1 - ( 5 - 5 ) 

correct to the second order in the density. From (5.4) and (5.5) we then obtain the relation

pX m =U 2 V ( 5 - 6 ) 

correct to the third order in the density. 




Now, in the case of strong static fields, the equations of equilibrium have the form 

W) s =°> (S- 7 ) 

where ( ) denotes covariant differentiation. Written out in full, (5.7) becomes 

riT s 

-~=- {aft ®T m a + {mfr a} 7 * (5.8) 

where the Greek letters run over the values 1, 2, 3, 4. Writing $T_m^s = p_m^s$, $T_m^4 = 0$, the equation
(5.8) becomes

= - fa, s}p m r + {ms, r)p* - fa, 4 }p m r + fa.4, 4}T (5.9) 

and, as we have already pointed out, the italic letters take the values 1, 2, 3 only.

Since the stresses are of the order of the square of the density in the case under discussion, 
we need only evaluate the first-order terms in the first three Christoffel brackets on the right- 
hand side of (5.9). Now, 

Pm !5S £ rS %’fli“ “ ( I ^fypmsi (5* 10 ) 


correct to the second order, and accordingly 


Also, 


\ rs , s }Pm tPmrg dXf 

& 

tPmrg dXr W™ dXr , 

c u s 1 . r J d&mt , 8gst ^g ms 

_ 1 . fdgmr fygn 8gms 

~ m \dx,dx m dx r 




%44 

ll 8x ’ 


■fa, 4}pm T = \Pmrg i4 "^ = 


fa4, 4}T£=\g^T£ 

ox m 


dcf> 


ax m 

Inserting these values in the right-hand side of (5.9), we have, correct to the order stated, 
Again, from (5.10) we have 

8x s (l 8x s 8x/ ms ‘ 


( 5 -«) 

(S-I2) 


Combining (5.11) and (5.12), we then obtain the equation of equilibrium in the form 


on using (5.6). 




P^-m 


We must emphasise that the theory only applies to weak fields, and that this form of
the equation of equilibrium does not persist in the case of strong fields. This is most easily
seen by considering a sphere composed of a perfect fluid, for which T m n =g m y. 





Summary 

Little progress has been made in the development of a relativity theory of elasticity, 
although it has been realised that no disturbance can be propagated with a velocity greater 
than that of light. In 1917 Lorentz (1) gave a relativistic formulation of the laws of elasticity 
in the case of small strain and, applying the theory to the problem of a rotating, incompressible, 
homogeneous disc, he claimed that the radius as measured by an observer at rest on the disc 
undergoes a contraction. His result was accepted by Eddington (4) but was attacked by others. 
A great deal has been written on the subject, but it has never been pointed out that both 
Lorentz and Eddington were considering material in which the waves of dilatation travel 
with an infinite velocity. In this paper we define “incompressible” matter as that in which
these waves are propagated with the velocity of light and Poisson’s ratio tends to the value ½.
This gives an upper limit to the modulus of compression $k$, which in this case is the elastic
constant $\lambda$, and as a result the expansion determined by the ordinary classical theory has to
be taken into account. It is found that the “relativity contraction” is exactly cancelled by
the “classical expansion”. Throughout the discussion on the rotating disc the analysis is 
restricted to the case of small strain. 

The equations of equilibrium of a continuous static distribution of matter are also 
investigated in the case of weak fields for which the fourth power of the density may be 
neglected. 


REFERENCES TO LITERATURE 

(1) Lorentz, H. A., 1917. Nature, cvi, 795.

    Lorentz, H. A., 1934. Collected Papers, VII, 171.

(2) Shu, Seyuan, 1945. Critical Studies on the Theory of Relativity, Princeton, N.J.

(3) Love, A. E. H., 1944. Mathematical Theory of Elasticity, Dover Publications, N.Y., p. 146.

(4) Eddington, A. S., 1924. Mathematical Theory of Relativity, Camb. Univ. Press, 2nd Ed., p. 112.

(5) Michelson, A. A., 1925. Astr. Journ., LXI, 137-140.

(6) Michelson, A. A., 1904. Phil. Mag. (6), VIII, 716.

(7) Berenda, C. W., 1942. Phys. Rev., lxii, 2nd series, p. 280.

(8) Whittaker, E. T., 1935. Proc. Roy. Soc., A, cxlix, 384-395.


(Issued separately May 20, 1949)





XLVII. Non-Associative Arithmetics.* By I. M. H. Etherington, University
of Edinburgh. (With Three Text-figures)

(MS. received January 8, 1946. Revised MS. received July 11, 1947. Read July 1, 1946)

1. Introduction and Summary 

The systems of “ partitive numbers” introduced in this paper differ from ordinary number 
systems in being subject to non-associative addition. They are intended primarily to serve as 
the indices of powers in algebraic systems having non-associative multiplication, or as the 
coefficients of multiples in systems with non-associative addition, but are defined more generally 
than is probably necessary for these purposes. They are essentially the same as root-trees 
(Setzbaume) † with non-branching knots other than terminal knots ignored, with operations
of addition and multiplication defined. 

Partitive numbers are of two kinds, partitioned cardinals and partitioned serials , defined 
respectively as the partition-types of repeatedly partitioned classes and series. For each 
kind, multiplication is binary (i.e. any ordered pair has a unique product) and associative. 
Addition is in general a free operation (i.e. the summands are not limited to two, and indeed,
assuming the multiplicative axiom, may form an infinite class or series); but it is non-associative,
which means that for example $a + b + c$ (involving one operation of addition) is distinguished
from $(a + b) + c$ and $a + (b + c)$ (involving two operations). A one-sided distributive
law is obeyed:

$a(\Sigma b_i) = \Sigma(ab_i)$; in general $(\Sigma b_i)a \neq \Sigma(b_ia)$.

Partitioned cardinals are commutative in addition. 

Closed subsystems of partitive numbers occur when we suitably restrict the meaning of 
class , series or partitioned , and correspondingly restrict the freedom of addition. Thus 
partitioned ordinals are got by considering only well-ordered series; finite partitive numbers 
arise from finite classes or series, forming systems closed under “finitely free” addition; and 
$n$-ary numbers may be considered; e.g. binary partitive numbers (corresponding to bifurcating
trees), which form a system closed under binary addition, are obtained by taking partition to 
mean dichotomy. 

Altitude is defined for partitive numbers as by Cayley for trees, and many difficulties are 
avoided by restricting attention throughout to numbers which may be infinite but are of finite 
altitude. Some theorems (e.g. unique factorization) are proved by “non-associative induction” 
—essentially an induction on the altitude. 

The definitions are given first (§§ 4, 5, 7) in terms of the theory of classes, and are framed so 
as to emphasize analogies with cardinal, ordinal and serial numbers.‡ In §§ 8, 9 the arithmetics
are redefined axiomatically. By adding fresh axioms further arithmetics are derived, 
appropriate for the indices of powers in algebras with special properties. The elements of all 
these arithmetics can be interpreted as classes of partitioned serials. (Added in proof. A
simpler axiomatic formulation for finite partitive numbers is given in a forthcoming paper
by A. Robinson, 1949.)


2. Logarithmetic 

Using certain obvious conventions for denoting powers, the logarithmetic of an algebra 
with non-associative binary multiplication was defined as the arithmetic of the indices of 
powers of the general element of the algebra.§ The logarithmetic of an associative algebra is, 

* This paper was assisted in publication by a grant from the Carnegie Trust for the Universities of Scotland. 

† Cayley (1857, etc.). These are trees in which one knot is specified as the root of the tree.

‡ Also with the arithmetics of partially ordered systems (see Birkhoff, 1937, 1940). Generalization on the
lines of Birkhoff (1942) would doubtless be possible.

§ Etherington (1939 a, 1941 b). We can consider also the logarithmetic of a particular element or other
subset, which will be a homomorphism of that of the general element.





generally, ordinary arithmetic; for a non-associative algebra the logarithmetic has generally 
non-associative addition, and various non-associative arithmetics arise in this way from 
algebras of special types. 

The conventions are: $x^{a+b}$ means $x^ax^b$, $x^{ab}$ means $(x^a)^b$, $x^{a^n}$ means $(\ldots(x^a)^a\ldots)^a$ with the
power iterated $n$ times. For example,

$x^{(2+1)2} = (x^2x)^2, \quad x^{2(2+1)} = x^2x^2 \cdot x^2, \quad x^{(2+1)^2} = (x^2x)^2(x^2x).$
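These conventions can be made concrete by evaluating index expressions in a product that is deliberately non-associative. In the sketch below an index is either the unit `1`, a sum `("+", a, b)` or a product `("*", a, b)`, and the "algebra" is the set of fully bracketed words under juxtaposition. This representation and all the names are illustrative assumptions, not the paper's notation.

```python
ONE = 1


def add(a, b):
    """Index a + b, so that x^(a+b) means x^a x^b."""
    return ("+", a, b)


def mul(a, b):
    """Index ab, so that x^(ab) means (x^a)^b."""
    return ("*", a, b)


TWO = add(ONE, ONE)
THREE = add(TWO, ONE)        # the index 2 + 1, distinct from 1 + 2


def power(x, index, product):
    """Evaluate x^index using the given binary product."""
    if index == ONE:
        return x
    op, a, b = index
    if op == "+":
        return product(power(x, a, product), power(x, b, product))
    return power(power(x, a, product), b, product)


def bracket(p, q):
    """A non-associative product: juxtaposition that records its brackets."""
    return "(" + p + q + ")"


if __name__ == "__main__":
    print(power("x", mul(THREE, TWO), bracket))     # (x^2 x)^2
    print(power("x", mul(TWO, THREE), bracket))     # x^2 x^2 . x^2
    print(power("x", mul(THREE, THREE), bracket))   # (x^2 x)^2 (x^2 x)
    print(power("x", add(ONE, TWO), bracket))       # x . x^2, not the same as x^2 . x
```

The bracketed words make the order of multiplication visible, so the indices $2 + 1$ and $1 + 2$ give different powers, while $ab \cdot c$ and $a \cdot bc$ always give the same one.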

Identities connecting indices $a, b, c, \ldots$ are used to express identities between powers. Thus
we write

$ab \cdot c = a \cdot bc, \quad a(b + c) = ab + ac$

to indicate that in any algebra $x^{ab \cdot c} = x^{a \cdot bc}$ since both mean $((x^a)^b)^c$, and $x^{a(b+c)} = x^{ab+ac}$ since
both mean $(x^a)^b(x^a)^c$. If multiplication in the algebra is associative, or merely associative for
powers (i.e. $x^ax^b \cdot x^c = x^a \cdot x^bx^c$),

$(a + b) + c = a + (b + c)$;

if commutative,

$a + b = b + a$;

if the algebra is palintropic,*

$ab = ba$, with consequences $(a + b)c = ac + bc$, $ab \cdot cd = ac \cdot bd$;

also for all the palintropic algebras which I have encountered,

$(a + b) + (c + d) = (a + c) + (b + d)$.

Corresponding generalized non-associative arithmetics arise if we consider more general 
algebras in which multiplication is not necessarily binary. We shall speak of n -ary, restricted , 
finitely free or free algebras according to the nature of multiplication. The definitions to be 
given for partitive numbers are suggested by considering the logarithmetic of a system in 
which the fundamental operation is free multiplication, with the convention for addition of 
indices generalized: $x^{\sum a_i}$ means $\prod x^{a_i}$.


3. Laws 

We shall have to refer frequently to the following “laws”, various sets of which are obeyed
in the various number systems. Cardinals obey all except the (U) and cancellation laws;
finite cardinals obey the latter and $(U^\times_C)$; serials obey the associative and right distributive
laws, and ordinals obey also $(\mathrm{sub}_L)$; finite serials and ordinals are isomorphic with finite
cardinals. These facts (assuming the multiplicative axiom where infinite sums or products are 
involved) are well known. 


Commutative Laws

$a + b = b + a$, $(c_+)$
$ab = ba$. $(c_\times)$


Associative Laws

$a + (b + c) = (a + b) + c$, $(a_+)$
$a.bc = ab.c$. $(a_\times)$


Distributive Laws

$a(b + c) = ab + ac$, $(d_R)$
$(b + c)a = ba + ca$. $(d_L)$


* I.e. $(x^a)^b = (x^b)^a$. Etherington (1945; cf. 1940, Theorem XIV, and 1941 b); also Murdoch (1939,
Theorem 10, Corollary). The palintropic property is a consequence of $x^{(a+b)+(c+d)} = x^{(a+c)+(b+d)}$ (Etherington,
1945, p. 120; Theorem 4 in the present paper).



444 I. M. H. Etherington

To this familiar list we may add a further pair, obvious consequences of the (c) and (a) laws, to 
which the name entropic will be given: 


Entropic Laws 

$(a + b) + (c + d) = (a + c) + (b + d)$, $(e_+)$

$ab.cd = ac.bd$. $(e_\times)$
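The binary entropic law can be tried out on the four operations of ordinary arithmetic, which the paper remarks all satisfy entropic laws although subtraction and division are not associative. The sketch below is only a spot check on a few sample rationals, not a proof; the helper names `is_entropic` and `is_associative` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product
import operator


def is_entropic(op, values):
    """Check (a o b) o (c o d) == (a o c) o (b o d) on all sample quadruples."""
    return all(op(op(a, b), op(c, d)) == op(op(a, c), op(b, d))
               for a, b, c, d in product(values, repeat=4))


def is_associative(op, values):
    """Check (a o b) o c == a o (b o c) on all sample triples."""
    return all(op(op(a, b), c) == op(a, op(b, c))
               for a, b, c in product(values, repeat=3))


# Positive samples only, so that division never meets a zero divisor.
VALUES = [Fraction(n) for n in (1, 2, 3, 5)]
OPS = {"+": operator.add, "x": operator.mul,
       "-": operator.sub, "/": operator.truediv}

# Maps each operation to the pair (entropic?, associative?).
report = {name: (is_entropic(op, VALUES), is_associative(op, VALUES))
          for name, op in OPS.items()}
```

Exact rationals are used instead of floats so that the equalities are tested without rounding error.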

The above laws refer to addition and multiplication as binary operations. They will 
require amplification when the operations are not restricted to being binary, and the corre¬ 
sponding statements will then be called full laws: these are to be interpreted as making certain 
assertions whenever these assertions have meaning, i.e. whenever the sums and products
mentioned exist. 


Full Commutative Laws

A sum $\sum a_i$ or product $\prod a_i$ has the same value in whatever manner the $a$'s may be re-ordered. $(C_+)$ $(C_\times)$


Full Associative Laws

A sum $\sum a_i$ or product $\prod a_i$ has the same value in whatever manner brackets may be inserted. $(A_+)$ $(A_\times)$


Full Distributive Laws

$a(\sum b_i) = \sum (ab_i)$, $(\sum b_i)a = \sum (b_i a)$. $(D_R)$ $(D_L)$

Note that the right distributive law $(d_R)$ or $(D_R)$ asserts that a factor of a product can be
distributed on to the summands of a factor on its immediate right.


Full Entropic Laws *

$\sum_i \sum_j a_{ij} = \sum_j \sum_i a_{ij}$, $\prod_i \prod_j a_{ij} = \prod_j \prod_i a_{ij}$. $(E_+)$ $(E_\times)$


Law of Induction (Definition of finite for cardinal numbers.) 

Any finite number other than 1 can be formed from 1 by repeated additions of 1. (i)

The principle of “non-associative induction” (§§ 5, 8) may be regarded as the amplification 
of this. 

If a number can be expressed as an unbracketed sum (product) of numbers, the latter are its 
summands (factors) ; a number other than 1 is prime if it has no factors other than itself and 1. 


Unique Separation 

Any number other than 1 can be resolved into summands which are uniquely determined in
order. $(U^+_S)$

. . . are uniquely determined except as regards order. $(U^+_C)$


* These are consequences of the (C) and (A) laws. It may be remarked that in ordinary arithmetic the
operations of subtraction and division are non-commutative and non-associative; also division is unlike multiplication
as regards the distributive law $(D_R)$, though it satisfies a law corresponding to $(D_L)$. However, the
four operations $+$, $\times$, $-$, $\div$ all satisfy full entropic laws; e.g. it is true that

$(a \div b) \div (c \div d) = (a \div c) \div (b \div d)$,

$(a - b - c) - (d - e - f) = (a - d) - (b - e) - (c - f)$.

Toyoda (1941); in the theory of quasi-groups (Murdoch, 1939). $n$-ary entropic systems have been studied by





Unique Factorization 

Any number, if not prime or 1, can be resolved into prime factors which are uniquely
determined in order. $(U^\times_S)$

. . . are uniquely determined except as regards order. $(U^\times_C)$

Cancellation Laws 

Each of the statements 

$a + b = a + c$, $b + a = c + a$, $ab = ac$, $ba = ca$

implies $b = c$. $(\mathrm{sub}_L)$ $(\mathrm{sub}_R)$ $(\mathrm{div}_L)$ $(\mathrm{div}_R)$

These can be amplified, e.g.

$\sum a_i = \sum b_i$, where the two series have equal first terms, implies equality of these sums with
the first terms omitted. $(\mathrm{SUB}_L)$

In all the arithmetics to be considered, multiplication will be a binary operation obeying
$(d_R)$ and $(a_\times)$. Addition will be a free operation unless otherwise stated; but the results
proved for free addition will be valid with suitable modifications when it is restricted in any way,
and whenever full laws are referred to it is implied that they are interpreted according to the
nature of the operation. Then an arithmetic will be called commutative if $(C_+)$ holds, associative
if $(A_+)$, entropic if $(E_+)$, palintropic if $(c_\times)$.


4. Partitive Numbers 

The word class is used in its usual sense except that the null class will always be excluded, 
and subclass means “non-null subclass”. A class is simply ordered and will be called a series
if there is a transitive relation of order between any two distinct elements A, B of it (i.e. either
A precedes B or B precedes A, not both; if A precedes B and B precedes C, then A precedes
C). It is well ordered if it is simply ordered and every subclass has a first element. 

A simple partition of a class is a separation of the class into proper subclasses. A simple 
partition of a series means the following. The series considered as a class is simply partitioned, 
the ordering relation is preserved within each subclass, and the partitioning is such that the 
ordering relation applies also to the subclasses (i.e. is such that if P, Q are any two distinct
subclasses, either all elements of P precede all elements of Q and we say P precedes Q, or vice
versa). 

Let any class of elements be partitioned in stages as follows. Let it be simply partitioned; 
let all subclasses which do not consist of single elements be again simply partitioned; and so on, 
until at a final stage all the subclasses consist of single elements. Such a “partitioned class” 
will be called a clan. If a series is dealt with in the same way, being partitioned perhaps 
repeatedly and ultimately into single elements, the resulting “ partitioned series ” or “ simply 
ordered clan” will be called a school. The degree $\delta = \delta(s)$ of a clan or school $s$ is the cardinal
number of its elements; the altitude $\alpha = \alpha(s)$ is the ordinal number of partition stages. A clan
or school is finite if its degree is finite. 

The words “ and so on ” in the preceding paragraph could refer to an infinite series of stages 
preceding the final one if a rule determining the subclasses at any stage is given. Thus infinite 
altitudes could be considered; but for simplicity we shall assume throughout that the altitude
is finite. 

A class consisting of a single element, which is necessarily a series, cannot actually be 
partitioned, but will nevertheless be called a clan or school of zero altitude and denoted 1;
thus $\delta(1) = 1$, $\alpha(1) = 0$.

The word society will be ambiguous: by giving it different meanings we shall arrive at
different arithmetics. Interpreting it consistently as class, the numbers at which we shall 
arrive are the (assumed already familiar) cardinal numbers, and the interpretation series leads 
to serial numbers (or ordinals if the series is well ordered), with addition and multiplication 




correctly defined in each case. The interpretations clan and school lead respectively to 
partitioned cardinals or commutative partitive numbers, and partitioned serials or non-commutative
partitive numbers (partitioned ordinals if the school is well ordered). For any
interpretation, similarity of societies must be suitably understood. We shall call two societies 
similar when there is a one-one correspondence between them element to element, and subclass 
to subclass at every stage if they are partitioned, in which all relations of inclusion of elements in 
subclasses and all relations of order in the one society hold for the corresponding elements and 
subclasses of the other. 

By the word number in §§ 5-7 we shall understand a society, and we shall call two numbers
equal ($a = b$) and think of them as the same number* if the two societies are similar.† A
number is cardinal, ordinal, serial or partitive according to the meaning of society as already
explained. With any interpretation of society it will be necessary to verify that the addition
and multiplication which we are going to define are consistent, i.e. that if $a_i = b_i$ for a series of
values of the variable $i$, then $\sum a_i = \sum b_i$ and $\prod a_i = \prod b_i$.

5. Addition 

Exclusive societies are such as have no common element. 

Definition. To form the sum of a given series of numbers $s_i$ (where the different values
of the variable $i$ form a series). Taking it for granted that we can find exclusive societies
with these numbers, combine them into a single society $s$, partitioned (if society means clan
or school) into the already partitioned classes $s_i$, preserving (if society means series or school)
the order in which they are given and the order within each society. Then we define

$s = \sum s_i$;

the number $s$ is the sum of the numbers $s_i$, which are the summands of $s$.

The exclusive societies required in the definition may be constructed by taking as the 
representative of each $s_i$ the society consisting of all ordered pairs $(i, S_i)$, where $S_i \in s_i$, ordered
and partitioned if necessary so that it is similar to $s_i$.

Evidently $\delta(s) = \sum \delta(s_i)$; and, if society means clan or school, $\alpha(s) = 1 + \max \alpha(s_i)$.

Addition is a free operation, but can be restricted in any way by suitably restricting the 
meaning of the word series in the definition. Consistency for each interpretation of society 
may be verified: this requires the multiplicative axiom if the number of summands is infinite. 
Addition is fully associative if society means class or series , fully commutative if it means class 
or clan, fully entropic if it means class, fully commutative and entropic if society and series both
mean finite series.

It follows from the definitions of addition and equality that the summands of a partitioned 
cardinal [partitioned serial] $s \neq 1$ are uniquely determined [and their order is unique]; they are
the numbers of the societies into which $s$ is first separated in the process of partitioning,
partitioned [and ordered] as in $s$; their altitudes are all less than $\alpha(s)$, and at least one of them
has altitude $\alpha(s) - 1$.

Thus partitioned cardinals obey $(U^+_C)$, partitioned serials obey $(U^+_S)$, and in consequence
both obey the full additive cancellation laws $(\mathrm{SUB}_L)$, $(\mathrm{SUB}_R)$.

It also follows from the definitions that a partitive number s is equal to a bracketed sum of 
a series of $\delta(s)$ 1's. To obtain this representation, express $s$ as the sum of its summands,
express each summand which $\neq 1$ as the sum of its summands, and so on. The process
terminates after a(s) such separations. We shall call this law 

Non-Associative Induction 

Any partitive number other than 1 can be formed entirely from 1's by repeated
additions. (I)

Note that (I) applies to infinite partitive numbers (of finite altitude). 

Hence to prove a theorem $\theta(a)$ involving an arbitrary partitive number $a$, it is enough to
prove that a series of propositions $\theta(s_i)$ together always imply $\theta(\sum s_i)$, and that $\theta(1)$ is true.

* E.g. when words such as “unique” are used. 

† It is convenient to draw no distinction in their definitions between a number and the society to which it
applies, but to distinguish them in use by calling two numbers equal where we should call the two societies
similar. Alternatively, the number of a society could be defined as the class of all societies similar to it. 





6. Notations 

A finite clan or school, and the corresponding partitive number, can be represented in 
various ways : 

(1) By inserting brackets in a row of symbols representing the elements. (2) By using
dots instead of brackets, more dots between elements indicating prior partition. (3) By trees.
The root of a tree stands for the complete class, the other knots for the subclasses, the free knots
for the elements. The altitude is the number of knots above the root on the longest branch.
(4) By the symbols of ordinary arithmetic, with the abbreviations* 2 = (1 + 1), 3 = (1 + 1 + 1),
4 = (1 + 1 + 1 + 1), etc. for numbers of unit altitude. Until multiplication is introduced (§ 7)
this notation scarcely differs from (1). 


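[Fig. 1: the five clans of degree 4 in notations II (dots), III (trees) and IV (numerical). In notation IV they are 4, 3 + 1, 2 + 1 + 1, (2 + 1) + 1, 2 + 2.]

[Fig. 2: the six further schools of degree 4 in notations II′, III′, IV′. In notation IV′ they are 1 + 3, 1 + 2 + 1, 1 + 1 + 2, (1 + 2) + 1, 1 + (2 + 1), 1 + (1 + 2).]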


From a class of four elements we can form five dissimilar clans, of altitudes 1, 2, 2, 3, 2, as 
shown above in the second, third and fourth notations, labelled II, III, IV. These therefore 
are the distinct partitioned cardinals of degree 4. 

From a series of four elements we can form eleven schools, viz. the five societies already 
symbolized, now ordered, and the six further types shown in II′, III′, IV′. Thus there are
eleven distinct partitioned serials of degree 4.†
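The counts of five partitioned cardinals and eleven partitioned serials of degree 4 can be reproduced by brute-force enumeration. The sketch below is an illustration only and assumes an encoding that is not the paper's: the number 1 is the empty tuple, any other number is the tuple of its summands, and a partitioned cardinal is obtained from a partitioned serial by sorting the summands at every level.

```python
from functools import lru_cache
from itertools import product

ONE = ()  # hypothetical encoding of the number 1


def compositions(n):
    """Ordered ways of writing n as a sum of at least two positive integers."""
    def rec(remaining, prefix):
        if remaining == 0:
            if len(prefix) >= 2:
                yield tuple(prefix)
            return
        for first in range(1, remaining + 1):
            yield from rec(remaining - first, prefix + [first])
    yield from rec(n, [])


@lru_cache(maxsize=None)
def serials(n):
    """All partitioned serials of degree n, as nested tuples."""
    if n == 1:
        return (ONE,)
    found = []
    for parts in compositions(n):
        # Choose a partitioned serial independently for each summand.
        for choice in product(*(serials(k) for k in parts)):
            found.append(tuple(choice))
    return tuple(found)


def as_cardinal(s):
    """Forget the order of summands at every level."""
    return tuple(sorted(as_cardinal(t) for t in s))


def cardinals(n):
    """All partitioned cardinals of degree n, as a set of canonical forms."""
    return {as_cardinal(s) for s in serials(n)}


serial_counts = [len(serials(n)) for n in range(1, 5)]
cardinal_counts = [len(cardinals(n)) for n in range(1, 5)]
```

For degrees 1 to 4 this gives 1, 1, 3, 11 partitioned serials and 1, 1, 2, 5 partitioned cardinals, in agreement with the figures quoted in the text for degrees 3 and 4.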

More generally, any partitioned cardinal or serial of altitude 1 may be represented by the 
symbol used for the corresponding unpartitioned number (if an appropriate symbol exists), 
and the fourth mode of notation can then be used for infinite partitive numbers. 


7. Multiplication 

The definitions below give the correct result for cardinals and for serials; but for serials it
should be pointed out that some writers denote by $ba$ what we shall call $ab$. The notation
used here is conditioned by the application to logarithmetic.

Definition. To form the product of a given pair of societies or numbers $a$, $b$, consider

* For the logarithmetic of a binary algebra it is more convenient to define 3 = 2 + 1, 4 = (2 + 1) + 1,
5 = ((2 + 1) + 1) + 1, etc., as in my previous papers.

† The problem of enumerating the partitioned serials of given finite degree $\delta$ is the same as that of enumerating
the schools which can be formed from $\delta$ elements, and was solved by Schröder, 1870, Problem 2; cf. Etherington,
1941 c, Case 5. But the enumeration of partitioned cardinals of given degree (Cayley, 1857, concluding paragraphs)
is not the same as the corresponding problem on clans (Schröder, ibid., Problem 4); e.g. from three
elements A, B, C we can form four clans (ABC, A.BC, AB.C, AC.B) but only two partitioned cardinals (3, 2 + 1).




a society $p$ similar to the second factor $b$, whose elements are societies similar to the first factor
$a$; and then regard $p$ as consisting of the elements of the latter societies. Then we define

$ab = p$;

the number $p$ is the product of $a$ and $b$; $a$, $b$ are left and right factors of $p$. Evidently

$\delta(p) = \delta(a)\,\delta(b)$, $\alpha(p) = \alpha(a) + \alpha(b)$.

This is illustrated in the following example, where the trees denote either clans or schools:— 


[Fig. 3: trees illustrating the product $ab$.]

The following is an equivalent constructive definition. 

Definition. The product $ba$ (not $ab$) is the society consisting of all ordered pairs $A_B$ where
$A \in a$, $B \in b$; ordered lexically if $a$ and $b$ are ordered; partitioned if $a$ and $b$ are partitioned,
according to the following rule. Consider subclasses of the class of pairs, all pairs with the
same $A$ going into the same subclass; partition this class of subclasses similarly to the second
factor $a$; partition each subclass similarly to the first factor $b$.

To illustrate, using the same example as above in a more suitable notation: 

$a = A.BC.D$, $b = EF.G$,
$ab = E_A.E_B E_C.E_D : F_A.F_B F_C.F_D :. G_A.G_B G_C.G_D$,
$ba = A_E A_F.A_G :. B_E B_F.B_G : C_E C_F.C_G :. D_E D_F.D_G$.



The consistency of the definition, for each interpretation of society, is easily proved. 
Multiplication thus defined is a binary operation, in which i is always neutral. It is associative, 
but is non-commutative and non-entropic (unless society means class or finite series ). 

The right distributive law, $a\sum b_i = \sum (ab_i)$, may be proved from the definition, but is most
easily seen to be true by considering trees. Place the trees representing the $b_i$ in order; to add
them, we have to join their roots to a single new root; to premultiply them by $a$, we have to
erect the tree for $a$ at each free knot of each of them. Then $(D_R)$ merely asserts that the order
of these two operations is immaterial. To illustrate the law, using the examples of multiplication
already given, we have

$(1 + 2 + 1)(2 + 1) = \{(1 + 2 + 1) + (1 + 2 + 1)\} + (1 + 2 + 1)$,

$(2 + 1)(1 + 2 + 1) = (2 + 1) + \{(2 + 1) + (2 + 1)\} + (2 + 1)$.
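The tree description of the product translates into a few lines of code: to form the product, erect the tree of the left factor at every free knot of the tree of the right factor. The sketch below is an illustration only; the nested-tuple encoding (the number 1 as the empty tuple, any other number as the tuple of its summands) and the function names are assumptions, not the paper's notation.

```python
ONE = ()  # hypothetical encoding of the number 1


def add(*summands):
    """Sum of a series of at least two partitive numbers."""
    if len(summands) < 2:
        raise ValueError("a sum needs at least two summands")
    return tuple(summands)


def mul(a, b):
    """Product ab: replace every free knot (leaf) of b by a copy of a."""
    return a if b == ONE else tuple(mul(a, t) for t in b)


def degree(s):
    return 1 if s == ONE else sum(degree(t) for t in s)


def altitude(s):
    return 0 if s == ONE else 1 + max(altitude(t) for t in s)


TWO = add(ONE, ONE)
a = add(ONE, TWO, ONE)   # 1 + 2 + 1
b = add(TWO, ONE)        # 2 + 1

ab = mul(a, b)           # expected {(1+2+1) + (1+2+1)} + (1+2+1)
ba = mul(b, a)           # expected (2+1) + {(2+1) + (2+1)} + (2+1)
```

With this encoding the right distributive law holds by construction, since `mul(a, add(x, y))` is built as `add(mul(a, x), mul(a, y))`. The left distributive law can fail: `mul(add(TWO, ONE), TWO)` and `add(mul(TWO, TWO), mul(ONE, TWO))` are different tuples.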

We shall prove the law of unique factorization $(U^\times_S)$ for partitive numbers. As a lemma,
we need the left cancellation law ($ab = ac$ implies $b = c$), and we shall prove this by non-associative
induction (I). The right cancellation law will then follow from $(U^\times_S)$.

Considering partitioned cardinals, we shall say that a particular number $a$ has the left
cancellation property (l.c.p.) if, for all $p$, $pa = pq$ implies $a = q$. Let $s$ be a number whose
summands $s_i$ all have the l.c.p. Then $ps = pq$ implies $\alpha(q) = \alpha(s) \neq 0$, implies $q \neq 1$, and therefore
implies $p\sum s_i = p\sum q_j$, where $q_j$ are the summands of $q$. Applying in succession $(D_R)$, $(U^+_C)$
and the l.c.p. hypotheses, this implies that the series $s_i$ is a derangement of the series $q_j$ [for
partitioned serials, obeying $(U^+_S)$, we should conclude further that $s_i = q_i$], and hence implies
$s = q$. Thus $s$ has the l.c.p. if its summands all have this property. Now 1 has the l.c.p., and





hence by (I) all partitioned cardinals [and similarly all partitioned serials] have the l.c.p. 
This proves the l.c. law for partitive numbers. 

It follows from $(U^+)$ and $(D_R)$ that a partitive number has a left factor $a$ distinct from itself
if and only if all its summands have the same left factor $a$. A partitive number other than 1 is
prime if it has no factors other than itself and 1: this is the case if and only if its summands
have no common left factor other than 1. If $s = pqr \ldots$ where $p$ is prime, $p$ is a left prime
factor of $s$. Evidently every partitive number other than 1 has a left prime factor (equal to
itself if it is prime). If a partitive number has two distinct left prime factors, then the same
must be true of each of its summands. Now numbers of altitude 1, being prime, have unique
left prime factors, and hence by (I) it follows that the left prime factor of any partitive number
other than 1 is unique. From this with $(\mathrm{div}_L)$, $(U^\times_S)$ follows, and $(\mathrm{div}_R)$.

Exponentiation .—As multiplication is associative, the continued product of n equal 
numbers a may be denoted a n . No definition of exponentiation of a partitive number with a 
partitive exponent is offered. 


8. Abstract Non-Associative Arithmetics 

In this and the next section we shall characterize various arithmetics abstractly. The 
four denoted $P_S$, $P_C$, $A_S$, $A_C$ may be interpreted as consisting respectively of partitioned
serials, partitioned cardinals, serials, cardinals. The rest are intermediate between these
(homomorphisms of the whole or part of $P_S$ which can be represented homomorphically on
$A_C$), with the exception of the collapsed arithmetics (lacking the second property).

An arithmetic will be taken to mean a system consisting of elements called numbers to 
which the following six postulates apply. 

( + ) For any series of numbers there is a unique number called their sum. We leave it to 
be stated in each case whether this is a free, $n$-ary or otherwise restricted operation, and interpret
series accordingly, defining it as in § 4 in the free case. 

( x) For any ordered pair of numbers there is a unique number called their product. 

(=) There is a relation symbolized as equality between elements, having the usual formal 
properties of an equivalence relation, and such that the operations ( + ), (x) are consistent. 
Equality may mean absolute identity, or membership of the same equivalence class. For 
verbal convenience we assume the first meaning and use words such as “same”, “unique” 
which will otherwise need periphrasing. 

$(D_R)$ The full right distributive law.

(1) There is a number 1 which is neutral in multiplication ($1a = a1 = a$).

(I) Non-associative induction. All other numbers can be derived from 1 by finite repetition 
of the operation of addition (not necessarily in unique ways). 

These postulates are logically consistent, for in partitive numbers, serials and cardinals we 
have four examples of systems to which they apply. 

All six postulates are assumed and used in the following theorem. 

Theorem 1. $(a_\times)$.

Proof. Consider a series of numbers $s_i$ which are such that $pq.s_i = p.qs_i$ for all numbers
$p$, $q$. Then by $(D_R)$

$pq.\sum s_i = \sum (pq.s_i) = \sum (p.qs_i) = p.\sum qs_i = p.q\sum s_i$.

Thus $s = \sum s_i$ has the property $pq.s = p.qs$ for arbitrary $p$, $q$ if the $s_i$ separately have this property.
Now $s = 1$ has this property, by (1); and hence by (I) all numbers have it, i.e. $(a_\times)$.

The postulate (I) implies that any number other than 1 has a series of summands; also that 
to every number there corresponds at least one partitioned serial, namely that which is derived 
from 1 by the same operations of addition; the correspondence in the opposite direction is 
unique. We shall call this correspondence the p.s. notation for the abstract numbers; using, 
e.g., the bracketed numerical representation of § 6 (IV) (IV'), it enables us to denote in writing 
all numbers whose derivation from 1 is sufficiently simple, including all finite numbers (i.e. 
those representable by finite partitioned serials). In this notation we can add and, using $(D_R)$,
multiply numbers correctly. Also we can speak of the degree and altitude of a number. 




meaning those of a corresponding partitioned serial; but we cannot say that the summands,
degree and altitude are uniquely determined (i.e. the same for all equal numbers), since we do
not know without some further postulate what numbers in the p.s. notation can be equated. 
The existence of such a notation proves the following theorem. 

Theorem 2. —Partitioned serials form the most general system satisfying the above six 
postulates; all other arithmetics are either isomorphisms or homomorphisms of it, or * 
of subsystems of it. 


9. Examples of Abstract Arithmetics 

We shall assume henceforward a further postulate concerning equality, namely: 

[=] Two numbers, derived from 1 by stated operations of addition and multiplication, 
are equal if and only if their equality is a logical consequence of the other postulates of the 
arithmetic. 

Thus we shall now always assume the seven postulates 

$(+)$ $(\times)$ $(=)$ $[=]$ $(D_R)$ (1) (I),

together with certain identities specified in each case as extra postulates. Then [=] implies 
that two numbers given in the p.s. notation are equal if and only if they have the same notation 
or can be proved equal by application of the extra postulates. 

Unless otherwise stated, addition is a free operation. 

Partitive Arithmetics 

$P_S$ (arithmetic of partitioned serials): no extra postulates. This means that the p.s.
notation is a one-one representation of the arithmetic, for there are no postulates by which two
numbers with different p.s. notations could be proved equal. $(U^+_S)$ follows.

$P_C$ (arithmetic of partitioned cardinals): extra postulate $(C_+)$. $(U^+_C)$ follows.


Palintropic Arithmetics 

$\Pi_S$ (palintropic serials): extra postulate $(c_\times)$.

$\Pi_C$ (palintropic cardinals): extra postulates $(C_+)$ $(c_\times)$.

It will be shown in Theorem 3 that $(c_\times)$ as an extra postulate can be replaced by $(D_L)$.

Entropic Arithmetics 

$E_S$ (entropic serials): extra postulate $(E_+)$.

$E_C$ (entropic cardinals): extra postulates $(C_+)$ $(E_+)$.

These arithmetics are also palintropic, for it will be shown in Theorem 4 that $(E_+)$ as an extra
postulate includes $(c_\times)$. Thus $E_S$, $E_C$ are homomorphisms of $\Pi_S$, $\Pi_C$. The converse is not
true; e.g. $(1 + 2) + (3 + 4) = (1 + 3) + (2 + 4)$ in $E_S$, $E_C$, but not in $\Pi_S$, $\Pi_C$.

Associative Arithmetics 

$A_S$ (serials): extra postulate, either $(A_+)$ with addition free or $(a_+)$ with addition binary.
$A_C$ (cardinals): ditto, together with $(C_+)$ or $(c_+)$. We omit the proof that the arithmetics
thus abstractly defined are isomorphic with those of serials and cardinals which we have
already supposed defined class-theoretically. $(A_+)$ with $(C_+)$ of course implies $(E_+)$, so that
$A_C$ is a homomorphism of $E_C$, in fact of each of the seven preceding arithmetics. If finite
numbers only are considered, $(A_+)$ as an extra postulate implies $(C_+)$, so that $A_S$ and $A_C$ are
abstractly the same.


* If addition is restricted.





The homomorphism relations of the above arithmetics are depicted by arrows in the 
following scheme:— 


                 Non-Commutative        Commutative
                 Arithmetics            Arithmetics

Partitive:          $P_S$         →        $P_C$
                      ↓                      ↓
Palintropic:        $\Pi_S$       →        $\Pi_C$
                      ↓                      ↓
Entropic:           $E_S$         →        $E_C$
                      ↓                      ↓
Associative:        $A_S$         →        $A_C$

In the $P$, $\Pi$ and $E$ arithmetics the degree and altitude of a given number are uniquely
determined. For example, to prove this in $E_S$ we have, from § 5,

$\delta(\sum_i \sum_j a_{ij}) = \sum_i \sum_j \delta(a_{ij}) = \sum_j \sum_i \delta(a_{ij}) = \delta(\sum_j \sum_i a_{ij})$,

$\alpha(\sum_i \sum_j a_{ij}) = 1 + \max_{(\text{all } i)} \{1 + \max_{(\text{fixed } i)} \alpha(a_{ij})\} = 2 + \max_{i,j} \alpha(a_{ij}) = \alpha(\sum_j \sum_i a_{ij})$;

thus no application of $(E_+)$ can alter the degree or altitude of a number in the p.s. notation.
Hence if two numbers of $E_S$ are equal, they have the same degree and altitude. In $A_S$ and $A_C$
the degree of a number is of course uniquely determined, but not its altitude.
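The invariance of degree and altitude under the entropic law can be illustrated on the example (1 + 2) + (3 + 4). The sketch below is an assumption-laden illustration, not the paper's method: numbers are encoded as nested tuples (the number 1 as the empty tuple), and the hypothetical function `entropic_swap` applies the full entropic law once at the top two levels by transposing the array of summands.

```python
ONE = ()  # hypothetical encoding of the number 1


def unit(k):
    """The number k of unit altitude, k = 1 + 1 + ... + 1 (or 1 itself)."""
    return ONE if k == 1 else tuple(ONE for _ in range(k))


def degree(s):
    return 1 if s == ONE else sum(degree(t) for t in s)


def altitude(s):
    return 0 if s == ONE else 1 + max(altitude(t) for t in s)


def entropic_swap(s):
    """Rewrite a double sum over i then j as the double sum over j then i."""
    if s == ONE or any(row == ONE for row in s):
        raise ValueError("every summand must itself be a sum")
    if len({len(row) for row in s}) != 1:
        raise ValueError("summands must have equally many summands")
    return tuple(zip(*s))


s = ((unit(1), unit(2)), (unit(3), unit(4)))   # (1 + 2) + (3 + 4)
t = entropic_swap(s)                           # (1 + 3) + (2 + 4)
```

The two tuples `s` and `t` differ, so they are distinct partitioned serials, yet both have degree 10 and altitude 3, as the displayed calculation predicts.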

We recall that $(U^\times_S)$ and cancellation laws hold in $P_S$ and $P_C$. (Proofs as in § 7.) It
seems likely that $(U^\times_C)$ and cancellation laws can be proved for $\Pi_S$, $\Pi_C$, $E_S$, $E_C$. If not,
further arithmetics may be obtainable by imposing them as extra postulates. Theorem 5 will
show that if multiplicative cancellation laws are assumed, the postulate $(c_\times)$ of $\Pi_S$ and $\Pi_C$ can
be replaced by $(e_\times)$. Regarding factorization, it should be observed that the prime factors of a
number will generally be different in the different arithmetics. For example,
$[(1 + 2) + (3 + 4)] + [(2 + 4) + (1 + 3)]$ is prime in $P_S$, $P_C$, $\Pi_S$, $\Pi_C$, $E_S$, but is equal to
$[(1 + 2) + (3 + 4)]2$ in $E_C$, and to $2^2 . 5$ in $A_S$, $A_C$.

The proofs of the theorems referred to above will now be given. Their purpose is to 
show that certain selections of postulates do not give distinct arithmetics. The postulates 
$(+)$ $(\times)$ $(=)$ (1) are tacitly assumed.


Theorem 3. (I) $(D_R)$ $(c_\times)$ are together equivalent to (I) $(D_R)$ $(D_L)$.

Proof. It is obvious that $(D_R)$ with $(c_\times)$ implies $(D_L)$, and we shall prove the converse,
that (I) $(D_R)$ $(D_L)$ imply $(c_\times)$.

Let $s_i$ be a series of numbers such that for an arbitrary number $p$, $s_i p = p s_i$. Then using
both $(D_R)$ and $(D_L)$, we have

$(\sum s_i)p = \sum (s_i p) = \sum (p s_i) = p\sum s_i$.

Thus $s = \sum s_i$ has the property of commuting with an arbitrary number if the $s_i$ all have this
property. Now 1 has the property, hence all numbers have it, i.e. $(c_\times)$.

Theorem 4. (I) $(D_R)$ $(E_+)$ imply $(c_\times)$ $(D_L)$.

Proof. Let $s_i$ be a series of numbers such that, for an arbitrary series of numbers $p_j$,

$(\sum_j p_j)s_i = \sum_j (p_j s_i)$;

and let $s = \sum s_i$. Then

$(\sum_j p_j)s = \sum_i ((\sum_j p_j)s_i)$, by $(D_R)$,

$= \sum_i \sum_j (p_j s_i)$, by hypothesis,

$= \sum_j \sum_i (p_j s_i)$, by $(E_+)$,

$= \sum_j (p_j s)$, by $(D_R)$.




Thus s has the left distributive property if its summands all have it. Now i has this property, 
and hence by (I) all numbers have it; i.e. $(D_L)$ holds and hence by Theorem 3 $(c_\times)$. (This
generalizes a result previously proved for binary algebras; cf. § 2, second footnote.)

Theorem 5. $(a_\times)$ $(c_\times)$ $(\mathrm{div}_L)$ are equivalent to $(a_\times)$ $(e_\times)$ $(\mathrm{div}_{L,R})$.

Proof. Obviously $(a_\times)$ $(c_\times)$ imply $(e_\times)$; and $(c_\times)$ $(\mathrm{div}_L)$ imply $(\mathrm{div}_R)$. Conversely, if $(a_\times)$
holds,

$ab.cd = ab.c : d = a.bc : d$,

$ac.bd = ac.b : d = a.cb : d$;

if $(e_\times)$ also holds, it follows that $a.bc : d = a.cb : d$; and if $(\mathrm{div}_{L,R})$ also hold, it follows that
$bc = cb$, i.e. $(c_\times)$.


Restricted Arithmetics 

Binary, ternary, . . . , n- ary, otherwise restricted, and finitely free arithmetics can be 
defined, forming closed subsystems of any of the previous arithmetics. With a few minor 
changes the whole of the above discussion applies; we are merely changing the meaning of the 
word series in the definition of addition. 

Thus, suppose that in $P_S$ we consider only those numbers derived from 1 by binary addition;
they correspond to trees which bifurcate at every knot. (In Etherington, 1939 a, I called
such trees pedigrees.) We are now taking series to mean ordered pair. We obtain an arithmetic,
binary $P_S$, which is the most general logarithmetic applying to a non-commutative
non-associative binary algebra. The corresponding binary $P_C$ applies similarly when the
algebra is commutative; and the corresponding subsystems of $\Pi_S$ and $\Pi_C$, and $E_S$ and $E_C$, apply
similarly when the algebra is palintropic, or is entropic as regards multiplication in its 
polynomial subalgebras. 


Collapsed Arithmetics 

In all the above arithmetics the degree of a given number is uniquely determined and gives 
a homomorphic representation of the arithmetic on $A_C$. Arithmetics which do not have this
property will be called collapsed. For example, starting with the postulates of $P_C$, let us as
an extra postulate equate all numbers whose altitudes exceed some fixed integer $m$. The
normalized elements of the gametic, zygotic and copular algebras for simple mendelian
inheritance (Etherington, 1939 b, 1941 a) provide examples where the logarithmetics are
collapsed and are of this type, with $m = 0, 1, 2$ respectively; so do Lie algebras ($m = 1$); and
Boolean algebras show the extreme case ($m = 0$). The arithmetic of ordinary integers modulo
an integer is an example of a finitely free collapsed arithmetic; the logarithmetic of a cyclic 
group is of this type. 





REFERENCES TO LITERATURE 

Birkhoff, G., 1937. “An extended arithmetic”, Duke Math. Journ., III, 311-316.

Birkhoff, G., 1940. Lattice Theory (Amer. Math. Soc. Colloquium Publication, No. 25), New York.

Birkhoff, G., 1942. “Generalized arithmetic”, Duke Math. Journ., IX, 283-302.

Cayley, A., 1857. “On the theory of the analytical forms called trees”, Phil. Mag., XIII, 172-176.
(Collected Math. Papers, III, No. 203.) (Subsequent papers on trees referred to in Etherington,
1939 a.)

Etherington, I. M. H., 1939. (a) “On non-associative combinations”, Proc. Roy. Soc. Edin.,
LIX, 153-162. (b) “Genetic algebras”, ibid., 242-258.

Etherington, I. M. H., 1940, 1945. “Commutative train algebras of ranks 2 and 3”, Journ. London Math. Soc., XV,
137-149; XX, 238.

Etherington, I. M. H., 1941. (a) “Non-associative algebra and the symbolism of genetics”, Proc. Roy. Soc. Edin.,
LXI, B, 24-42. (b) “Some non-associative algebras in which the multiplication of indices is
commutative”, Journ. London Math. Soc., XVI, 48-55. (c) “Some problems of non-associative
combinations (1)”, Edin. Math. Notes, No. 32, 1-6.

Etherington, I. M. H., 1945. “Transposed algebras”, Proc. Edin. Math. Soc. (2), VII, 104-121.

Murdoch, D. C., 1939. “Quasi-groups which satisfy certain generalized associative laws”, Amer.
Journ. Math., LXI, 509-522.

Robinson, A., 1949. “On non-associative systems”, Proc. Edin. Math. Soc. (2), VIII. [In press.]

Schröder, E., 1870. “Vier combinatorische Probleme”, Zeits. Math., XV, 361-376.

Toyoda, K., 1941. “On axioms of linear functions”, Proc. Imp. Acad. Tokyo, XVII, 221-227.


(Issued separately May 20, 1949)



( 454 ) 


XLVIII.— On Commuting Matrices and Commutative Algebras.* By D. E. 
Rutherford, M.A., B.Sc., Dr.Math., United College, University of St Andrews 

(MS. received January 12, 1948. Read July 5, 1948) 

Introduction 

The structure of commutative associative linear algebras is well known and is usually derived 
from more general results concerning non-commutative algebras (Cartan, Frobenius). The 
novelty of the present treatment is that while it avoids the complexities of the non-commutative 
case, it exhibits the essential relationship between the theory of commuting matrices and that 
of commutative algebras. 

While theorems 1 and 2 of this paper are implicit in the writings of Voss (1889), Taber 
(1890), and Plemelj (1901), it has been considered worth while to recapitulate these results in 
the explicit form required for the discussion of commutative algebras. In doing so, some new 
facts emerge. 

Commuting Matrices 
1. Let $A$ and $B$ be square matrices such that 

$$AB = BA. \tag{1}$$

A non-singular matrix $H$ exists, such that 

$$HAH^{-1} = A_1 \dotplus A_*,$$

where every latent root of $A_1$ is $\alpha_1$, and where no latent root of $A_*$ is equal to $\alpha_1$. If $HBH^{-1}$ 
is partitioned conformally with $A_1 \dotplus A_*$, say 

$$HBH^{-1} = \begin{bmatrix} B_1 & B' \\ B'' & B_* \end{bmatrix},$$

then, from the fact that $HAH^{-1}$ commutes with $HBH^{-1}$, we can deduce the following 
relations: 

$$A_1B_1 = B_1A_1, \quad A_1B' = B'A_*, \quad B''A_1 = A_*B'', \quad A_*B_* = B_*A_*.$$

From the second of these, we conclude that if the reduced characteristic function of $A_1$ be 
$(x - \alpha_1)^r$, then 

$$B'(A_* - \alpha_1 I_*)^r = (A_1 - \alpha_1 I_1)^r B' = 0,$$

where $I_*$ and $I_1$ denote unit matrices. Now $A_* - \alpha_1 I_*$ is a non-singular matrix, for no latent 
root of $A_*$ is equal to $\alpha_1$. Consequently $B' = 0$, and similarly $B'' = 0$. It has therefore been 
shown that a non-singular matrix $H$ exists, for which 

$$HAH^{-1} = A_1 \dotplus A_*, \quad HBH^{-1} = B_1 \dotplus B_*,$$

where 

$$A_*B_* = B_*A_*. \tag{2}$$
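The vanishing of the off-diagonal blocks can be checked by brute force on a small case. The sketch below is illustrative only: the matrix `A`, the entry range, and all function names are assumptions made for the example, not part of the paper. It takes a matrix already in block form, with a 2 x 2 block having the single latent root 1 and a 1 x 1 block with latent root 2, and enumerates every 3 x 3 matrix with entries in {-1, 0, 1} that commutes with it.

```python
from itertools import product


def matmul(X, Y):
    """Plain matrix product of two square integer matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Illustrative choice: A = A_1 (+) A_*, where A_1 has the single latent
# root 1 (a 2x2 block) and A_* has the latent root 2 (a 1x1 block).
A = [[1, 1, 0],
     [0, 1, 0],
     [0, 0, 2]]


def commuting_matrices(A, entries=(-1, 0, 1)):
    """Yield every 3x3 matrix with the given entries that commutes with A."""
    for flat in product(entries, repeat=9):
        B = [list(flat[0:3]), list(flat[3:6]), list(flat[6:9])]
        if matmul(A, B) == matmul(B, A):
            yield B


def off_diagonal_blocks_vanish(B):
    """B' is rows 0-1 of column 2; B'' is columns 0-1 of row 2."""
    return B[0][2] == 0 and B[1][2] == 0 and B[2][0] == 0 and B[2][1] == 0


found = list(commuting_matrices(A))
print(len(found), all(off_diagonal_blocks_vanish(B) for B in found))
```

Every commuting matrix found in this search is itself a direct sum conformal with `A`, as the argument above predicts.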

Treating equation (2) in the same way as equation (1), we eventually obtain after a number 
of steps 

$$HAH^{-1} = A_1 \dotplus A_2 \dotplus \cdots \dotplus A_i, \quad HBH^{-1} = B_1 \dotplus B_2 \dotplus \cdots \dotplus B_i, \tag{3}$$

* This paper was assisted in publication by a grant from the Carnegie Trust for the Universities of 
Scotland. 



where 




$$A_jB_j = B_jA_j, \quad (j = 1, \ldots, i) \tag{4}$$

and where every latent root of $A_j$ is $\alpha_j$, the roots $\alpha_1, \ldots, \alpha_i$ being all different. 

Suppose now that for some value of $j$ the latent roots of $B_j$ are not all equal. In this case 
we can apply the treatment once more to (4), reversing the roles of $A$ and $B$. When this 
has been done for every $B_j$, we eventually obtain relations of the same appearance as (3) and 
(4), where every latent root of $A_j$ is $\alpha_j$ and every latent root of $B_j$ is $\beta_j$. The roots $\alpha_1, \ldots, \alpha_i$, 
however, need no longer be distinct. 

Continuing this process still further, it appears that we have the following result, first 
proved by Plemelj (1901). 

Theorem 1.—If the matrices $A, B, \ldots, D$ all commute, then a non-singular matrix $H$ 
can be found such that 

$$HAH^{-1} = A_1 \dotplus \cdots \dotplus A_i,$$
$$HBH^{-1} = B_1 \dotplus \cdots \dotplus B_i,$$
$$\cdots$$
$$HDH^{-1} = D_1 \dotplus \cdots \dotplus D_i,$$

where the matrices $A_j, B_j, \ldots, D_j$ commute amongst themselves, where every latent root of 
$A_j$ is $\alpha_j$, every latent root of $B_j$ is $\beta_j$, . . ., and every latent root of $D_j$ is $\delta_j$. 

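2. In view of Theorem 1, we now confine our attention to commuting matrices $A, B, \ldots, D$ 
with the property that each latent root of $A$ is $\alpha$, each latent root of $B$ is $\beta$, and so on. 
A non-singular matrix $H$ can therefore be found such that 

$$HAH^{-1} = (\alpha I_u + J_u) \dotplus (\alpha I_v + J_v) \dotplus \cdots,$$

where $I_u$ and $J_u$ are matrices of order $u$ of the form 

$$I_u = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{bmatrix}, \quad J_u = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix},$$

and where $u \ge v \ge \cdots$. If the rows and columns of $HAH^{-1}$ are now rearranged in the 
order 

$$1,\ u+1,\ u+v+1,\ \ldots;\quad 2,\ u+2,\ u+v+2,\ \ldots;\quad \ldots$$

and if $K$ denotes the product of the appropriate permutation matrix with $H$, then 

$$KAK^{-1} = \begin{bmatrix} \alpha I_p & I_{pq} & 0 & \cdots & 0 \\ 0 & \alpha I_q & I_{qr} & \cdots & 0 \\ 0 & 0 & \alpha I_r & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \alpha I_t \end{bmatrix}, \tag{5}$$

where $[p, q, r, \ldots, t]$ is the partition conjugate to $[u, v, \ldots]$ and $p \ge q \ge r \ge \cdots$, 
and where $I_{pq}$ is the submatrix of $p$ rows and $q$ columns of the form 

$$I_{pq} = \begin{bmatrix} I_q \\ 0 \end{bmatrix}.$$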
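The rearrangement leading to (5) can be traced on a small example. The following sketch is illustrative only: the function names and the block sizes $[u, v, w] = [3, 2, 1]$ are assumptions chosen for the example. It builds the nilpotent part $J_u \dotplus J_v \dotplus \cdots$, applies the ordering $1, u+1, u+v+1, \ldots, 2, u+2, \ldots$ (in 0-based form, skipping positions that a short block does not possess), and computes the conjugate partition.

```python
def jordan_nilpotent(block_sizes):
    """Direct sum J_u (+) J_v (+) ... with ones on each block's superdiagonal."""
    n = sum(block_sizes)
    J = [[0] * n for _ in range(n)]
    start = 0
    for size in block_sizes:
        for i in range(size - 1):
            J[start + i][start + i + 1] = 1
        start += size
    return J


def rearranged_order(block_sizes):
    """0-based form of the order 1, u+1, u+v+1, ..., 2, u+2, u+v+2, ..."""
    starts, s = [], 0
    for size in block_sizes:
        starts.append(s)
        s += size
    order = []
    for pos in range(max(block_sizes)):
        for st, size in zip(starts, block_sizes):
            if pos < size:          # a block shorter than pos+1 contributes nothing
                order.append(st + pos)
    return order


def permute(M, order):
    """Apply the same permutation to rows and columns of M."""
    return [[M[i][j] for j in order] for i in order]


def conjugate_partition(parts):
    """Conjugate of a partition given as a non-increasing list."""
    return [sum(1 for p in parts if p > k) for k in range(max(parts))]


sizes = [3, 2, 1]
N = permute(jordan_nilpotent(sizes), rearranged_order(sizes))
for row in N:
    print(row)
print(conjugate_partition(sizes))
```

For these block sizes the conjugate partition is again $[3, 2, 1]$, and the permuted nilpotent part carries the blocks $I_{pq}$ and $I_{qr}$ immediately above the block diagonal, as in (5).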


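We may now use the obvious notation 

$$KBK^{-1} = \begin{bmatrix} B_{pp} & B_{pq} & B_{pr} & \cdots \\ B_{qp} & B_{qq} & B_{qr} & \cdots \\ B_{rp} & B_{rq} & B_{rr} & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{bmatrix}$$

to express the fact that $KBK^{-1}$ is partitioned conformally with $KAK^{-1}$. Since $KBK^{-1}$ 
commutes with $KAK^{-1}$, it also commutes with $KAK^{-1} - \alpha I$. Consequently, we have the 
matrix equation 

$$\begin{bmatrix} I_{pq}B_{qp} & I_{pq}B_{qq} & I_{pq}B_{qr} & \cdots \\ I_{qr}B_{rp} & I_{qr}B_{rq} & I_{qr}B_{rr} & \cdots \\ \cdots & \cdots & \cdots & \cdots \\ 0 & 0 & 0 & \cdots \end{bmatrix} = \begin{bmatrix} 0 & B_{pp}I_{pq} & B_{pq}I_{qr} & \cdots \\ 0 & B_{qp}I_{pq} & B_{qq}I_{qr} & \cdots \\ \cdots & \cdots & \cdots & \cdots \\ 0 & B_{tp}I_{pq} & B_{tq}I_{qr} & \cdots \end{bmatrix}.$$

Identification of the submatrices in respectively the (1,1)th, (2,2)th, (3,3)th, . . ., (2,1)th, 
(3,2)th, . . ., (3,1)th, . . . positions shows that every submatrix below the leading diagonal of 
$KBK^{-1}$ vanishes. Comparison of submatrices in the (1,2)th, (2,3)th, . . . positions shows 
that 

$$B_{pp} = \begin{bmatrix} B_{qq} & * \\ 0 & * \end{bmatrix}, \quad \ldots \tag{6}$$

In the same way it may be shown that 

$$B_{pq} = \begin{bmatrix} B_{qr} & * \\ 0 & * \end{bmatrix}, \quad \ldots \tag{7}$$

and so on. 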

This result goes a little deeper than Taber’s result, which states that the most general 
matrix which commutes with $KAK^{-1}$ can be written in the following form 


in which 



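For although every matrix which commutes with $KAK^{-1}$ is of the above form, not every 
matrix of this form commutes with $KAK^{-1}$. In general, certain matrix elements must 
vanish, as we have just seen. Thus the matrices 

$$\begin{bmatrix} \alpha & 0 & 0 & 1 & 0 & 0 \\ 0 & \alpha & 0 & 0 & 1 & 0 \\ 0 & 0 & \alpha & 0 & 0 & 0 \\ 0 & 0 & 0 & \alpha & 0 & 1 \\ 0 & 0 & 0 & 0 & \alpha & 0 \\ 0 & 0 & 0 & 0 & 0 & \alpha \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$

do not commute, although the second is of Taber’s form. 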
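The failure to commute can be confirmed directly. In the sketch below the value chosen for $\alpha$ is arbitrary and the function names are illustrative assumptions; the first matrix is $\alpha I$ plus ones in the positions shown, and the second has a single unit element in the third row and fourth column.

```python
def matmul(X, Y):
    """Plain matrix product of two square matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def example_A(alpha):
    """The first 6x6 matrix: alpha on the diagonal, ones at the block I_pq, I_qr positions."""
    A = [[alpha if i == j else 0 for j in range(6)] for i in range(6)]
    for i, j in [(0, 3), (1, 4), (3, 5)]:   # 0-based positions of the unit elements
        A[i][j] = 1
    return A


def example_B():
    """The second 6x6 matrix: a single 1 in row 3, column 4 (1-based)."""
    B = [[0] * 6 for _ in range(6)]
    B[2][3] = 1
    return B


def commutator(X, Y):
    """Return XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]


C = commutator(example_A(7), example_B())
nonzero = [(i, j, C[i][j]) for i in range(6) for j in range(6) if C[i][j] != 0]
print(nonzero)
```

The commutator has exactly one non-zero element, so the two matrices do not commute for any value of $\alpha$.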

3. We now prove the principal result on commuting matrices. 

Theorem 2.—A set of commuting matrices $A, B, \ldots, D$ can be reduced simultaneously 
by a similarity transformation to triangular form. 

By Theorem 1 it is sufficient to consider the case in which all the latent roots of $A$ are $\alpha$, 
all those of $B$ are $\beta$, and so on. Omitting the trivial case in which each matrix is scalar, we 
prove the theorem for matrices of order $n + 1$, assuming that it is true for matrices of order not 
greater than $n$. The basis of the induction is the fact that the most general matrix which 
commutes with $\begin{bmatrix} \alpha & 1 \\ 0 & \alpha \end{bmatrix}$ has the form $\begin{bmatrix} y & z \\ 0 & y \end{bmatrix}$. 
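The basis of the induction is easy to confirm by exhaustive search over a small range of integer entries. The sketch below is illustrative only; the entry range, the value of $\alpha$, and the function names are assumptions made for the example.

```python
from itertools import product


def commutes_with_jordan(B, alpha):
    """True if the 2x2 matrix B commutes with [[alpha, 1], [0, alpha]]."""
    A = [[alpha, 1], [0, alpha]]
    AB = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    BA = [[sum(B[i][k] * A[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    return AB == BA


def has_stated_form(B):
    """True if B = [[y, z], [0, y]] for some y, z."""
    return B[1][0] == 0 and B[0][0] == B[1][1]


# Compare the two conditions on every 2x2 matrix with entries in -2..2.
agree = all(
    commutes_with_jordan([[a, b], [c, d]], alpha=3) == has_stated_form([[a, b], [c, d]])
    for a, b, c, d in product(range(-2, 3), repeat=4)
)
print(agree)
```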

It has been shown that a matrix $K$ can be found such that $KAK^{-1}$ has the form (5) and 
such that 

$$KBK^{-1} = \begin{bmatrix} B_{pp} & B_{pq} & B_{pr} & \cdots \\ 0 & B_{qq} & B_{qr} & \cdots \\ 0 & 0 & B_{rr} & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{bmatrix}, \quad \ldots, \quad KDK^{-1} = \begin{bmatrix} D_{pp} & D_{pq} & D_{pr} & \cdots \\ 0 & D_{qq} & D_{qr} & \cdots \\ 0 & 0 & D_{rr} & \cdots \\ \cdots & \cdots & \cdots & \cdots \end{bmatrix}.$$

The fact that $A, B, \ldots, D$ commute implies that $A_{pp}, B_{pp}, \ldots, D_{pp}$ commute. So do 
$A_{qq}, B_{qq}, \ldots, D_{qq}$, and so on. By the induction hypothesis, we can find matrices $L_p, L_q, \ldots$ 
such that $L_pA_{pp}L_p^{-1}, \ldots, L_pD_{pp}L_p^{-1}$ are all triangular, $L_qA_{qq}L_q^{-1}, \ldots, L_qD_{qq}L_q^{-1}$ 
are all triangular, and so forth. It follows that if we write 

$$L = L_p \dotplus L_q \dotplus \cdots,$$

then $(LK)A(LK)^{-1}, \ldots, (LK)D(LK)^{-1}$ are all triangular. The theorem is therefore 
established. 

The theorem that a set of commuting matrices can be reduced simultaneously to triangular 
form was known to Schur,* but the proof given above affords further information concerning 
the nature of these triangular matrices. Let us express the formulae (6) and (7) in the notation 

$$B_{pp} = \begin{bmatrix} B_{qq} & B_{q,\,p-q} \\ 0 & B_{p-q,\,p-q} \end{bmatrix}, \quad B_{pq} = \begin{bmatrix} B_{qr} & B_{q,\,q-r} \\ 0 & B_{p-q,\,q-r} \end{bmatrix}.$$

It is clear from the first of these that we may choose $L_p$ to be of the form 

$$L_p = L_q \dotplus L_{p-q},$$

where $L_{p-q}$ reduces $B_{p-q,\,p-q}$ to triangular form. Similarly 

$$L_q = L_r \dotplus L_{q-r}, \quad \ldots$$

Thus, 

$$L_p = L_t \dotplus \cdots \dotplus L_{q-r} \dotplus L_{p-q}.$$

In consequence, a typical submatrix of $(LK)B(LK)^{-1}$ such as $L_pB_{pq}L_q^{-1}$ has the form 

$$\begin{bmatrix} L_q & 0 \\ 0 & L_{p-q} \end{bmatrix} \begin{bmatrix} B_{qr} & B_{q,\,q-r} \\ 0 & B_{p-q,\,q-r} \end{bmatrix} \begin{bmatrix} L_r^{-1} & 0 \\ 0 & L_{q-r}^{-1} \end{bmatrix}$$

or 

$$\begin{bmatrix} L_qB_{qr}L_r^{-1} & L_qB_{q,\,q-r}L_{q-r}^{-1} \\ 0 & L_{p-q}B_{p-q,\,q-r}L_{q-r}^{-1} \end{bmatrix}.$$

We have therefore established the fact that the submatrices of the triangular matrix 
$(LK)B(LK)^{-1}$ have all the properties established in § 2 for the submatrices of $KBK^{-1}$. In 
addition, since every latent root of $B$ is $\beta$ and since $(LK)B(LK)^{-1}$ is triangular, this matrix 
must have $\beta$ everywhere on the leading diagonal. 


Commutative Algebras 

4. In this section we shall employ the summation convention that terms involving 
equal upper and lower Greek suffixes are to be summed from 1 to n, or from 1 to m as the case 
may be. Roman suffixes, however, do not imply summation. In the case of matrix elements 
the upper suffix denotes the row and the lower suffix the column. 

* Schur in his lectures deduced this theorem from certain results of Frobenius, but he attributed the theorem 
to Voss. The present Author has, however, been unable to find an explicit statement of the theorem in the 
writings of Voss, although the theorem might be regarded as a corollary of the latter’s paper of 1889. 







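Consider a commutative associative algebra whose basic elements $u_1, \ldots, u_n$ have the 
multiplication table (summation from 1 to $n$) 

$$u_iu_j = \gamma^{\alpha}_{ij}u_{\alpha}.$$

The commutative law gives 

$$\gamma^{k}_{ij} = \gamma^{k}_{ji},$$

and the associative law yields 

$$\gamma^{\alpha}_{ij}\gamma^{l}_{\alpha k} = \gamma^{\alpha}_{jk}\gamma^{l}_{i\alpha}. \tag{8}$$

As is well known, it follows from (8) that the matrices 

$$\Gamma_i = [\gamma^{k}_{ij}]$$

(where $\gamma^{k}_{ij}$ is the $(k, j)$th element of $\Gamma_i$) afford a representation of the basis. That is to say, 

$$\Gamma_i\Gamma_j = \gamma^{\alpha}_{ij}\Gamma_{\alpha} = \Gamma_j\Gamma_i.$$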
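The representation by the matrices of structure constants can be tried on a concrete algebra. The sketch below is illustrative only: it assumes, as an example not taken from the paper, the group algebra of the cyclic group of order 3, whose basis elements (indexed here from 0) multiply by adding indices modulo 3. The function names are likewise assumptions.

```python
N = 3  # order of the illustrative algebra (group algebra of the cyclic group C_3)


def gamma(i, j, k):
    """Structure constant gamma^k_{ij} for u_i u_j = u_{(i+j) mod 3} (0-based indices)."""
    return 1 if k == (i + j) % N else 0


def Gamma(i):
    """Matrix whose (k, j)th element is gamma^k_{ij}: upper suffix row, lower suffix column."""
    return [[gamma(i, j, k) for j in range(N)] for k in range(N)]


def matmul(X, Y):
    return [[sum(X[r][t] * Y[t][c] for t in range(N)) for c in range(N)] for r in range(N)]


def linear_combination(i, j):
    """gamma^alpha_{ij} Gamma_alpha, summed over alpha."""
    total = [[0] * N for _ in range(N)]
    for alpha in range(N):
        G = Gamma(alpha)
        for r in range(N):
            for c in range(N):
                total[r][c] += gamma(i, j, alpha) * G[r][c]
    return total


def is_associative():
    """Check the associative law in the form of equation (8)."""
    return all(
        sum(gamma(i, j, a) * gamma(a, k, l) for a in range(N))
        == sum(gamma(j, k, a) * gamma(i, a, l) for a in range(N))
        for i in range(N) for j in range(N) for k in range(N) for l in range(N)
    )


def is_representation():
    """Check Gamma_i Gamma_j = gamma^alpha_{ij} Gamma_alpha = Gamma_j Gamma_i."""
    return all(
        matmul(Gamma(i), Gamma(j)) == linear_combination(i, j) == matmul(Gamma(j), Gamma(i))
        for i in range(N) for j in range(N)
    )


print(is_associative(), is_representation())
```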

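Consequently, if $H$ is any non-singular matrix of order $n$, the matrices 

$$\Lambda_i = H\Gamma_iH^{-1} \quad (i = 1, \ldots, n)$$

give another representation for which 

$$\Lambda_i\Lambda_j = \gamma^{\alpha}_{ij}\Lambda_{\alpha} = \Lambda_j\Lambda_i.$$

Let $H = [h^{i}_{j}]$, $H^{-1} = [H^{i}_{j}]$. A change to a new basis $v_1, \ldots, v_n$ by means of the 
transformation 

$$u_i = h^{\alpha}_{i}v_{\alpha}, \quad v_i = H^{\alpha}_{i}u_{\alpha}$$

gives 

$$v_iv_j = H^{\alpha}_{i}H^{\beta}_{j}\gamma^{\delta}_{\alpha\beta}h^{\epsilon}_{\delta}v_{\epsilon} = \omega^{\epsilon}_{ij}v_{\epsilon},$$

where 

$$\omega^{k}_{ij} = H^{\alpha}_{i}H^{\beta}_{j}\gamma^{\delta}_{\alpha\beta}h^{k}_{\delta}.$$

This new basis has a representation $\Omega_1, \ldots, \Omega_n$, where 

$$\Omega_i = [\omega^{k}_{ij}] = H^{\alpha}_{i}[h^{k}_{\delta}\gamma^{\delta}_{\alpha\beta}H^{\beta}_{j}] = H^{\alpha}_{i}\Lambda_{\alpha}.$$

Now according to Theorem 1 (§ 1), a matrix $H$ can be found for the commuting set of 
matrices $\Gamma_i$, such that the set $H\Gamma_iH^{-1}$ (or $\Lambda_i$) are simultaneously reducible to the direct sums 
of submatrices each of which has equal latent roots. Also by Theorem 2 (§ 3) the matrix $H$ 
can be further specialised to ensure that each of these submatrices is triangular. Since 
$\Omega_i = H^{\alpha}_{i}\Lambda_{\alpha}$, the matrices $\Omega_i$ will retain these properties. The given algebra is therefore a 
direct sum of subalgebras of which a typical one has a basis $v_1, \ldots, v_m$ $(m \le n)$ and a 
multiplication table (with summations from 1 to $m$) 

$$v_iv_j = \omega^{\alpha}_{ij}v_{\alpha}$$

with the following properties: 

$$\omega^{k}_{ij} = \omega^{k}_{ji}, \tag{9}$$

$$\omega^{1}_{i1} = \omega^{2}_{i2} = \cdots = \omega^{m}_{im} = \rho_i, \text{ say}, \tag{10}$$

$$\omega^{k}_{ij} = 0 \text{ if } k > i, \text{ or if } k > j, \tag{11}$$

$$\omega^{\alpha}_{ij}\omega^{l}_{\alpha k} = \omega^{\alpha}_{jk}\omega^{l}_{i\alpha}. \tag{12}$$

(9) and (12) are true because the algebra is still commutative and associative; (10) and (11) 
arise from the special properties of the matrices $\Omega_i$. 

A combination of (10) and (11) yields immediately 

$$\rho_i = 0. \quad (i < m)$$

From this it follows that if $\omega_i$ denotes the appropriate submatrix of order $m \times m$ of the matrix 
$\Omega_i$, then each matrix $\omega_i$, with the possible exception of $\omega_m$, is singular. If $\rho_m = 0$, then $\omega_m$ is 
also singular and the subalgebra is nilpotent, for any element $x = x^{\alpha}v_{\alpha}$ is represented by a 
matrix $x^{\alpha}\omega_{\alpha}$ which is triangular and has zeros everywhere on the leading diagonal. The 
multiplication table of such a nilpotent subalgebra takes the form 

$$v_iv_j = v_jv_i = \omega^{1}_{ij}v_1 + \cdots + \omega^{i-1}_{ij}v_{i-1}, \quad (i \le j \le m) \tag{13}$$

in which the coefficients $\omega^{k}_{ij}$ must satisfy (9) and (12). 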

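Alternatively, suppose $\rho_m \neq 0$. Then $\omega_m$ is non-singular. Let $(\omega_m)^{-1} = [\sigma^{i}_{j}]$. Then, 
using the Kronecker delta, 

$$\omega^{k}_{m\alpha}\sigma^{\alpha}_{j} = \delta^{k}_{j}.$$

Accordingly, by (9) and (12), 

$$\sigma^{\alpha}_{m}\omega^{k}_{\alpha i} = \sigma^{\alpha}_{m}\omega^{k}_{\alpha\beta}\delta^{\beta}_{i} = \sigma^{\alpha}_{m}\omega^{k}_{\alpha\beta}\omega^{\beta}_{m\gamma}\sigma^{\gamma}_{i} = \sigma^{\alpha}_{m}\omega^{\beta}_{\alpha m}\omega^{k}_{\beta\gamma}\sigma^{\gamma}_{i} = \delta^{\beta}_{m}\omega^{k}_{\beta\gamma}\sigma^{\gamma}_{i} = \omega^{k}_{m\gamma}\sigma^{\gamma}_{i} = \delta^{k}_{i}.$$

If we now write 

$$V = \sigma^{\alpha}_{m}v_{\alpha}$$

and choose $v_1, \ldots, v_{m-1}, V$ as a basis instead of $v_1, \ldots, v_m$, then the formulae for the 
products $v_iv_j$ $(i < m, j < m)$ remain unchanged, but 

$$Vv_i = \sigma^{\alpha}_{m}v_{\alpha}v_i = \sigma^{\alpha}_{m}\omega^{\beta}_{\alpha i}v_{\beta} = v_i,$$

and 

$$V^2 = V\sigma^{\alpha}_{m}v_{\alpha} = \sigma^{\alpha}_{m}v_{\alpha} = V.$$

It is clear that $V$ is the principal unit of the subalgebra. The multiplication table of this 
subalgebra now takes the form 

$$V^2 = V,$$

$$Vv_i = v_iV = v_i,$$

$$v_iv_j = v_jv_i = \omega^{1}_{ij}v_1 + \cdots + \omega^{i-1}_{ij}v_{i-1}, \quad (i \le j \le m - 1) \tag{14}$$

in which the coefficients satisfy (9) and (12). 

To summarise these results, we observe that every commutative associative algebra is the 
direct sum of subalgebras of the types (13) and (14). This is the well-known result (cf. Dickson, 
1930, p. 57). The distinction between types (13) and (14) is of course superficial, for we can 
always adjoin a principal unit to the given algebra, thereby ensuring that each subalgebra of 
the type considered has a principal unit. 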


REFERENCES TO LITERATURE 

Cartan, E., 1898. “Sur les groupes bilinéaires et les systèmes de nombres complexes”, Ann. de 
Toulouse, XII, 17. 

Dickson, L. E., 1930. Linear Algebras, Cambridge. 

Frobenius, G., 1903. “Theorie der hypercomplexen Grössen II”, Sitz. Akad. Berlin, 634. 

Plemelj, J., 1901. Monatshefte für Math. u. Phys., XII, 82. 

Taber, H., 1890. “On the matrical equation $\phi\Omega = \Omega\phi$”, Proc. American Acad. Arts and Sciences, 
N.S., XVIII, 64. 

Voss, A., 1889. “Ueber die mit einer bilinearen Form vertauschbaren bilinearen Formen”, Sitz. 
Bayer. Akad. Wiss., XIX, 283. 


(Issued separately May 20, 1949) 



( 460 ) 


XLIX.— Generalizations of a Problem of Pillai. By L. Mirsky, Department of 

Mathematics, University of Sheffield. Communicated by Professor A. G. Walker 

(Revised MS. received August 24, 1948. Read November 8, 1948) 

1. Throughout this paper $k_1, \ldots, k_s$ will denote $s \ge 1$ fixed distinct positive integers. 
Some years ago Pillai (1936) found an asymptotic formula, with error term $O(x/\log x)$, for the 
number of positive integers $n \le x$ such that $n + k_1, \ldots, n + k_s$ are all square-free. I recently 
considered (Mirsky, 1947) the corresponding problem for $r$-free integers (i.e. integers not 
divisible by the $r$th power of any prime), and was able, in particular, to reduce the error term 
in Pillai’s formula. 

Our present object is to discuss various generalizations and extensions of Pillai’s problem. 
In all investigations below we shall be concerned with a set $A$ of integers. This is any given, 
finite or infinite, set of integers greater than 1 and subject to certain additional restrictions 
which will be stated later. The elements of $A$ will be called $a$-numbers, and the letter $a$ will 
be reserved for them. A number which is not divisible by any $a$-number will be called 
$A$-free, and our main concern will be with the study of $A$-free numbers. Their additive 
properties have recently been investigated elsewhere (Mirsky, 1948), and some estimates 
obtained in that investigation will be quoted in the present paper. 

At various stages of our discussion one or more of the following assumptions will be made 
with regard to $A$: 

(1) $\sum_a a^{-1}$ converges. 

(2) $\sum_a a^{-1}$ diverges. 

(3) There exists a number $\vartheta$ $(0 \le \vartheta < 1)$ such that, for $x \to \infty$, $\sum_{a \le x} 1 = O(x^{\vartheta})$. 

(4) Any two elements of $A$ are coprime. 

We note that condition (1) is contained in (3). 

Numbers of the form $a_1^{t_1} \cdots a_h^{t_h}$, where $h \ge 1$, $t_1 \ge 0, \ldots, t_h \ge 0$ will be called 
$c$-numbers, and the letter $c$ will be reserved for them. In particular, 1 is a $c$-number, and so is 
every $a$-number. 

Let $C_1, \ldots, C_s$ be $s$ given, finite or infinite, sets of $c$-numbers. Denote by $\gamma(n)$ the 
greatest $c$-number dividing $n$, and by 

$$M = M(k_1, \ldots, k_s;\ A;\ C_1, \ldots, C_s)$$

the set of all integers $n$ such that * 

$$\gamma(n + k_1) \in C_1, \ \ldots, \ \gamma(n + k_s) \in C_s.$$

Furthermore, let $M(x)$ be the number of positive integers $n \le x$ belonging to $M$. Our aim in 
§§ 3-6 will be to investigate the behaviour of $M(x)$ for $x \to \infty$ on the assumptions that $A$ satisfies 
(3) and (4). We shall find an asymptotic formula for $M(x)$, and shall also prove that $M$ is 
either empty or has a positive density. 
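The definitions of $A$-free numbers and of $\gamma(n)$ can be made concrete for a small finite set. The sketch below is illustrative only: the set {4, 9, 25} and the function names are assumptions chosen for the example, and the computation of $\gamma(n)$ factor by factor relies on the $a$-numbers being pairwise coprime, i.e. on condition (4).

```python
from math import gcd
from itertools import combinations


def greatest_c_number(n, a_numbers):
    """gamma(n): the greatest product of powers of a-numbers dividing n.

    Assumes the a-numbers are pairwise coprime (condition (4)), so that
    the highest admissible power of each a-number can be taken separately.
    """
    assert all(gcd(a, b) == 1 for a, b in combinations(a_numbers, 2))
    c = 1
    for a in a_numbers:
        power = 1
        while n % (power * a) == 0:
            power *= a
        c *= power
    return c


def is_A_free(n, a_numbers):
    """n is A-free when no a-number divides it."""
    return all(n % a != 0 for a in a_numbers)


A_EXAMPLE = [4, 9, 25]  # illustrative choice: squares of the primes 2, 3, 5
print(greatest_c_number(72, A_EXAMPLE), is_A_free(72, A_EXAMPLE))
print(greatest_c_number(30, A_EXAMPLE), is_A_free(30, A_EXAMPLE))
```

In this example 72 = 8 · 9 has $\gamma(72) = 4 \cdot 9 = 36$, while 30 is $A$-free and has $\gamma(30) = 1$.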

In the remainder of the paper (§§ 7-11) we specialize our initial problem by taking each 
$C_i$ to consist of the single number 1. The set $M$ will in that case be denoted by $N$ and will 
consist of the integers $n$ such that $n + k_1, \ldots, n + k_s$ are all $A$-free. We shall denote by $N(x)$ 
the number of positive integers $n \le x$ in $N$, and we shall be concerned with the asymptotic 
behaviour of $N(x)$ under various assumptions made about $A$. 

* The symbol $l \in L$ means that $l$ is an element of the set $L$. 

I wish to express my thanks to Dr R. Rado for his help in connection with § 8, and to 
the Referee for a number of useful comments which enabled me to make many textual 
improvements. 

2. Our notation is as follows: 

If $p(\eta)$, $P(\eta)$ are two propositions concerning a variable $\eta$, then 

$p(\eta)\ (P(\eta))$ means that $p(\eta)$ holds for every $\eta$ for which $P(\eta)$ holds; 

$p(\eta)\ [P(\eta)]$ means that $p(\eta)$ holds for some $\eta$ for which $P(\eta)$ holds. 

We shall employ an analogous notation when there is more than one variable. 

The letters $x, y, \alpha, \beta, \xi$ will denote positive numbers and $\epsilon$ an arbitrarily small positive 
number; all other small letters will denote positive integers unless otherwise stated. We shall 
write $k = \operatorname{Max}_{1 \le i \le s} k_i$. The highest common factor and the lowest common multiple of 
$n_1, \ldots, n_s$ will be denoted by $(n_1, \ldots, n_s)$ and $\{n_1, \ldots, n_s\}$ respectively. 

For typographical reasons it is frequently convenient to write $l \equiv l'\ (*\ m)$ in place of 
$l \equiv l' \pmod{m}$, and also 

$$\sum S \mid f(m, n, \ldots)$$

in place of 

$$\sum_{S} f(m, n, \ldots);$$

here $S$ stands for the set of conditions which define the range of summation. 

As usual $\phi(n)$ denotes Euler’s function, $d(n)$ the number of divisors of $n$, and $\pi(x)$ the number 
of primes not exceeding $x$. 

$Q(n)$ is defined as 1 or 0 according as $n$ is or is not $A$-free; we also define $\omega(n) = 0$ if $a^2 \mid n\ [a]$, 
and $\omega(n) = (-1)^{\lambda}$ if $a^2 \nmid n\ (a)$, and $n$ is divisible by precisely $\lambda \ge 0$ distinct $a$-numbers. 

$D(m)$ denotes the number of numbers among $1, 2, \ldots, m$ which are congruent (mod $m$) 
to at least one of $k_1, \ldots, k_s$, i.e. the number of residue classes (mod $m$) represented by 
$k_1, \ldots, k_s$. 

We define $E(n_1, \ldots, n_s)$ as 1 or 0 according as the system of simultaneous congruences 
in $n$, 

$$n + k_i \equiv 0 \pmod{n_i}, \quad (1 \le i \le s) \tag{5}$$

is or is not soluble; moreover, $T(x;\ n_1, \ldots, n_s)$ denotes the number of positive integers 
$n \le x$ which satisfy the system (5). 

We write 

$$F(x; \alpha) = F(x; \alpha; k_1, \ldots, k_s) = \sum n \le x;\ n + k_i = c_im_i\ (1 \le i \le s);\ c_1 \cdots c_s > x^{\alpha} \mid 1,$$

$$N(x, y) = \sum n \le x;\ a \nmid (n + k_i)\ (1 \le i \le s;\ a \le y) \mid 1.$$

A set of integers $B$ will be said to have density $\alpha$ if, for $x \to \infty$, $\sum n \le x;\ n \in B \mid 1 \sim \alpha x$. It 
will be said to have logarithmic density $\alpha$ if $\sum n \le x;\ n \in B \mid n^{-1} \sim \alpha \log x$. 

The $O$-notation will be used in the following sense. If $\Phi$ and $\Psi$ are two functions of $x$ 
and certain other parameters, say $\lambda_1, \ldots, \lambda_r$, then $\Phi = O(\Psi)$ means that there exist positive 
constants $K, x_0$, depending at most on $k_1, \ldots, k_s, A, C_1, \ldots, C_s, \alpha, \beta, \epsilon, q$, such that for 
$x \ge x_0$ and all (or all specified) values of $\lambda_1, \ldots, \lambda_r$ we have $|\Phi| \le K\Psi$. If $\Phi$ and $\Psi$ do 
not depend on $x$, then the question of the existence of $x_0$ does not, of course, arise. 

3. We begin with some preliminary lemmas. 

Lemma 1.—The system of congruences (5) is soluble if and only if $(n_i, n_j) \mid (k_i - k_j)$ 
$(1 \le i < j \le s)$. In the case of solubility the solutions form precisely one residue class 
(mod $\{n_1, \ldots, n_s\}$). 

For a proof of this well-known result see Scholz (1939, p. 49). 
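Lemma 1 lends itself to an exhaustive check on small moduli. The sketch below is illustrative only: the shifts (1, 2, 4), the range of moduli, and the function names are assumptions made for the example. It compares the divisibility criterion with a direct count of the solutions of (5) in one complete set of residues modulo the lowest common multiple.

```python
from math import gcd
from itertools import combinations, product


def lcm_all(ns):
    """Lowest common multiple {n_1, ..., n_s}."""
    L = 1
    for m in ns:
        L = L * m // gcd(L, m)
    return L


def criterion(ks, ns):
    """Lemma 1: (n_i, n_j) divides k_i - k_j for every pair i < j."""
    return all((ks[i] - ks[j]) % gcd(ns[i], ns[j]) == 0
               for i, j in combinations(range(len(ns)), 2))


def solutions(ks, ns):
    """All n in 0 .. lcm-1 with n + k_i = 0 (mod n_i) for every i."""
    L = lcm_all(ns)
    return [n for n in range(L) if all((n + k) % m == 0 for k, m in zip(ks, ns))]


KS = (1, 2, 4)  # illustrative shifts k_1, k_2, k_3
checked = all(
    len(solutions(KS, ns)) == (1 if criterion(KS, ns) else 0)
    for ns in product(range(1, 7), repeat=3)
)
print(checked)
```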



Lemma 2. 




This is an immediate consequence of Lemma 1. 

Lemma 3. 

$$\frac{E(n_1, \ldots, n_s)}{\{n_1, \ldots, n_s\}} = O\left(\frac{1}{n_1 \cdots n_s}\right).$$

For a proof see Mirsky (1947, Lemma 3). 

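Lemma 4.—If $A$ satisfies (1) and (4), the multiple series 

$$\sum c_1, \ldots, c_s;\ c_i' \in C_i\ (1 \le i \le s) \ \Big|\ \omega(c_1) \cdots \omega(c_s)\,\frac{E(c_1c_1', \ldots, c_sc_s')}{\{c_1c_1', \ldots, c_sc_s'\}}$$

converges absolutely. 

Since $A$ satisfies (1) the product $\prod_a (1 - a^{-1})^{-1}$ converges, and therefore, by (4), $\sum_c c^{-1}$ 
converges also.* But, by Lemma 3, 

$$\omega(c_1) \cdots \omega(c_s)\,\frac{E(c_1c_1', \ldots, c_sc_s')}{\{c_1c_1', \ldots, c_sc_s'\}} = O\left(\frac{1}{c_1c_1' \cdots c_sc_s'}\right),$$

and the assertion therefore follows. 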

Lemma 5.—If $A$ satisfies (3) and (4), then 

$$\sum c_1 \cdots c_s > x \ \Big|\ \frac{1}{c_1 \cdots c_s} = O(x^{-1+\vartheta+\epsilon}),$$

$$\sum c_1 \cdots c_s \le x \mid 1 = O(x^{\vartheta+\epsilon}).$$

For a proof of these two estimates see Mirsky (1948, Lemmas 9 and 10). 

4. We next come to the crucial lemma required in the study of the function M(x). 

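Lemma 6.—If $A$ satisfies (3) and (4), then 

$$F(x; \alpha) = O(x^{1-\alpha(1-\vartheta)+\epsilon}) + O(x^{2\vartheta/(1+\vartheta)+\epsilon}).$$

The proof is by induction. For $s = 1$ we have, by Lemma 5 and (3), 

$$F(x; \alpha) = \sum n \le x;\ n + k_1 = cm;\ c > x^{\alpha} \mid 1$$

$$\le \sum x^{\alpha} < c \le x + k_1 \ \Big|\ \sum n \le x;\ n + k_1 \equiv 0\ (*\ c) \mid 1$$

$$= \sum x^{\alpha} < c \le x + k_1 \ \Big|\ O\left(\frac{x}{c} + 1\right)$$

$$= O(x^{1-\alpha(1-\vartheta)+\epsilon}) + O(x^{\vartheta+\epsilon}).$$

Thus the assertion holds for $s = 1$. Assume next that it holds for $s - 1$ where $s \ge 2$. Denoting 
by $\beta$ a number whose value will be fixed later, we have 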

F(x;a)=F(x; a; k 1} . . .,*,) = ]?><*; ***T^*; * . . . * s >*“|i 

< y* < *• n+k i =c i m i. *1 . . . <r 3 > x a J 

^ ’ (x < i < s) ’ c x . . . cjcj < x\x <j < *)| 

1 <K* (l<i<s)’ C 1 . . . cjq > xf>\ 

=F 1+ 2 W (6) 

l<i<i 


* We have, in fact, $\sum_c c^{-1} = \prod_a (1 - a^{-1})^{-1}$. 




say. Now, by Lemmas 2, 3, and 5, 

F x < 2* < x > * 1 + < | . fy % a < ■ • ■ c 3 < 1 

^... *. < ^ 4 ) 

= 0(i!c 1 - a ( 1 -^ +e ) + 0(*W(*-!)+«). ( 7 ) 

Furthermore, 

< 2 * < *; **<,•<” J ’; c i ■ ■ • ^ > **| 1 

= 2 * < *; (l <* < sfi*jy Cx • ■ • > **| 

< x > ^ < j < j. /^y)> > 

-o(**2><* ; (I ^^) ; • • £ ^ >x ^) 

= 0 {x P(x\ ft', &i, . . ., kj_ j_, • • »j 4 )}- 

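Hence, by the induction hypothesis, 

$$F_{2j} = O(x^{1-\beta(1-\vartheta)+\epsilon}) + O(x^{2\vartheta/(1+\vartheta)+\epsilon}). \tag{8}$$

Putting $\beta = (s - 1)/(s - 1 + \vartheta)$ we obtain, by (6), (7), and (8), 

$$F(x; \alpha) = O(x^{1-\alpha(1-\vartheta)+\epsilon}) + O(x^{\vartheta s/(s-1+\vartheta)+\epsilon}) + O(x^{2\vartheta/(1+\vartheta)+\epsilon}),$$

and the lemma now follows since $\vartheta s/(s - 1 + \vartheta) \le 2\vartheta/(1 + \vartheta)$ for $s \ge 2$. 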
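The closing inequality, and the effect of the choice of $\beta$, can be checked in exact rational arithmetic. The sketch below is illustrative only: the function names and the grid of values of $\vartheta$ and $s$ are assumptions, and the identification of $1 - \beta(1 - \vartheta)$ with the middle exponent is a reading of the proof rather than a quotation from it.

```python
from fractions import Fraction


def beta(s, theta):
    """The value beta = (s - 1)/(s - 1 + theta) chosen in the proof."""
    return Fraction(s - 1) / (s - 1 + theta)


def middle_exponent(s, theta):
    """theta * s / (s - 1 + theta), the exponent of the middle error term."""
    return theta * s / (s - 1 + theta)


def final_exponent(theta):
    """2 * theta / (1 + theta), the exponent appearing in Lemma 6."""
    return 2 * theta / (1 + theta)


grid = [(s, Fraction(j, 20)) for s in range(2, 13) for j in range(0, 20)]

inequality_holds = all(middle_exponent(s, t) <= final_exponent(t) for s, t in grid)
beta_balances = all(1 - beta(s, t) * (1 - t) == middle_exponent(s, t) for s, t in grid)
print(inequality_holds, beta_balances)
```

With $s = 2$ the two exponents coincide, which is why the inequality cannot be improved by this argument in that case.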

5. We are now in a position to obtain our main result concerning the asymptotic behaviour 
of $M(x)$. 

Theorem 1.—If $A$ satisfies (3) and (4), then 

$$M(x) = \sigma x + O(x^{2\vartheta/(1+\vartheta)+\epsilon}).$$

Throughout this proof it is assumed that $c_1' \in C_1, \ldots, c_s' \in C_s$. For given values of 
$c_1', \ldots, c_s'$ we clearly have $\gamma(n + k_1) = c_1', \ldots, \gamma(n + k_s) = c_s'$ if, and only if, 

ex I (» + *d, ■ ■ ■> ei | (*+£>; • • • o(~r)-x. 

Now we know [Mirsky, 1948, equation (14)] that 

C 1?1 

and therefore 

2 n+kt^CfCi'mA 

n<X ’ (i<i<s) |“^ • • * “W 

= S 1 + S 2} (9) 

say, where $c_1c_1' \cdots c_sc_s' \le x$ in $S_1$ and $c_1c_1' \cdots c_sc_s' > x$ in $S_2$. Now, by Lemmas 2, 4, 
3, and 5, 


Sj — ^ ^ Ys) F(cc j y ‘ • * J ) 

=<w + 0(** +e ). 



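Furthermore, 

$$|S_2| \le \sum n \le x;\ n + k_i = c_ic_i'm_i\ (1 \le i \le s);\ c_1c_1' \cdots c_sc_s' > x \mid 1$$

$$= \sum n \le x;\ n + k_i = c_i^{*}m_i\ (1 \le i \le s);\ c_1^{*} \cdots c_s^{*} > x \ \Big|\ \sum c_ic_i' = c_i^{*}\ (1 \le i \le s) \mid 1$$

$$\le \sum n \le x;\ n + k_i = c_i^{*}m_i\ (1 \le i \le s);\ c_1^{*} \cdots c_s^{*} > x \mid d(c_1^{*}) \cdots d(c_s^{*})$$

$$= O\Big(x^{\epsilon} \sum n \le x;\ n + k_i = c_i^{*}m_i\ (1 \le i \le s);\ c_1^{*} \cdots c_s^{*} > x \mid 1\Big)$$

$$= O\{x^{\epsilon}F(x; 1)\}. \tag{11}$$

Hence, by Lemma 6, 

$$S_2 = O(x^{2\vartheta/(1+\vartheta)+\epsilon}), \tag{12}$$

and the theorem now follows by (9), (10), and (12). 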

It is, perhaps, worth noticing that Theorem 1 is only of interest when $s \ge 2$. When $s = 1$ 
it can be sharpened and the expression for the constant $\sigma$ simplified. Indeed, if $C_0$ is a 
prescribed class of $c$-numbers, it is easy to show that the number $M_0(x)$ of positive integers 
$n \le x$ such that $\gamma(n) \in C_0$ is given by 

$$M_0(x) = \Big(\sum_{c \in C_0} \frac{1}{c}\Big) \prod_a \Big(1 - \frac{1}{a}\Big)\,x + O(x^{\vartheta+\epsilon}).$$

We may also note that any improvement made in the critical Lemma 6 will result in a 
sharpening of the error term in Theorem 1. For suppose the relation 

$$F(x; 1) = O(x^{\vartheta'+\epsilon})$$

is valid for some $\vartheta' < 2\vartheta/(1 + \vartheta)$. Then, by (9), (10), and (11), we obtain 

$$M(x) = \sigma x + O(x^{\theta+\epsilon}),$$

where $\theta = \operatorname{Max}(\vartheta, \vartheta') < 2\vartheta/(1 + \vartheta)$; this would clearly constitute an advance on Theorem 1 
provided that $\vartheta > 0$. 

6. Theorem 1 shows, in particular, that the set $M$ has the density $\sigma$, but it gives us no 
information whether this density is positive or zero. This question will be settled now. 

Theorem 2.—If $A$ satisfies (3) and (4), then $M$ is either empty or has a positive density. 

In view of Theorem 1 it is clearly sufficient to show that if $M$ is not empty, then 

$$M(x) > \alpha x \tag{13}$$

for some $\alpha > 0$ (independent of $x$) and all sufficiently large values of $x$. As (13) will be needed 
again in § 11, we note that it will be established without making use of (4), and replacing (3) 
by the weaker condition (1). 

Throughout the proof $x$ is taken to be sufficiently large. If $M$ is not empty, then there 
exist numbers $c_1, \ldots, c_s, n_0$ such that $c_i \in C_i$, and 

$$\gamma(n_0 + k_1) = c_1, \ \ldots, \ \gamma(n_0 + k_s) = c_s. \tag{14}$$

Take $\xi > 0$ such that 

$$\sum_{a > \xi} \frac{1}{a} < \frac{1}{2s}. \tag{15}$$

Denote by $a_1, \ldots, a_r$ the $a$-numbers not exceeding $\xi$, and write $c_0 = c_1 \cdots c_sa_1 \cdots a_r$. 
Let $D$ be the set of all numbers $n$ given by $n = n_0 + c_0t$ $(t = 0, 1, 2, \ldots)$. Also let, for $1 \le i \le s$ 
and any $a$, 

$$S_i(x, a) = \sum n \le x;\ n \in D;\ ac_i \mid (n + k_i) \mid 1.$$

By (14) it follows that the congruence in $t$, 

$$c_0t \equiv -n_0 - k_i \pmod{ac_i},$$

is soluble if, and only if, $(ac_i, c_0) \mid c_i$. Hence it is certainly insoluble for $a \le \xi$, while for 
$a > \xi$ its solutions, if any exist, form precisely one residue class (mod $a$). But 

$$S_i(x, a) = \sum 0 \le t \le (x - n_0)/c_0;\ c_0t \equiv -n_0 - k_i\ (*\ ac_i) \mid 1,$$

and therefore 

$$S_i(x, a) = 0 \quad (a \le \xi), \tag{16}$$

$$S_i(x, a) \le \frac{x}{ac_0} + 1 \quad (a > \xi). \tag{17}$$

Also trivially 

$$S_i(x, a) = 0 \quad (a > x + k). \tag{18}$$

Using (14) we have 

$$M(x) \ge \sum n \le x;\ n \in D;\ \gamma(n + k_i) = c_i\ (1 \le i \le s) \mid 1$$

$$\ge \sum n \le x;\ n \in D \mid 1 \ - \ \sum_a \sum_{1 \le i \le s} S_i(x, a),$$

and therefore, by (16), (17), (18), and (15), 

$$M(x) \ge \frac{x - n_0}{c_0} - \sum \xi < a \le x + k \ \Big|\ s\Big(\frac{x}{ac_0} + 1\Big)$$

$$> \frac{x - n_0}{c_0} - \frac{x}{2c_0} - sA(x + k) > \frac{x}{3c_0},$$

since (1) implies $A(x) = o(x)$. 

7. From now on we shall be concerned with the function $N(x)$ defined in § 1. We begin 
by observing a simple consequence of Theorem 1. 

Theorem 3.—If $A$ satisfies (3) and (4), then 

$$N(x) = \tau x + O(x^{2\vartheta/(1+\vartheta)+\epsilon}), \tag{19}$$

where 

$$\tau = \prod_a (1 - D(a)a^{-1}).$$

The convergence of the product expressing $\tau$ follows, of course, trivially by (1). To prove 
Theorem 3 we take each $C_i$ in Theorem 1 to consist of the single number 1. We then obtain 

$$N(x) = \tau'x + O(x^{2\vartheta/(1+\vartheta)+\epsilon}), \tag{20}$$

where 

$$\tau' = \sum c_1, \ldots, c_s \ \Big|\ \omega(c_1) \cdots \omega(c_s)\,\frac{E(c_1, \ldots, c_s)}{\{c_1, \ldots, c_s\}};$$

the identity $\tau' = \tau$ can now be established by means of a generalization of Euler’s identity 
for multiplicative functions (Mirsky, 1948, Lemma A), the argument being straightforward 
though somewhat lengthy. This argument can, however, be avoided, and the identity $\tau' = \tau$ 
will follow at once by (20) and Theorem 4. 

A particularly interesting special case of Theorem 3 arises when $A$ is taken to be the class 
of $r$th powers of all primes, where $r$ is any integer greater than 1. $N(x)$ then becomes the 
number of $n \le x$ such that $n + k_1, \ldots, n + k_s$ are all $r$-free, and we obtain the asymptotic 
formula referred to in § 1 (Mirsky, 1947), namely 

$$N(x) = \prod_p (1 - D(p^r)p^{-r})\,x + O(x^{2/(r+1)+\epsilon}).$$
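The special case of $r$-free numbers is easy to examine numerically. The sketch below is illustrative only: the shifts (1, 2), the bound x = 20000, the truncation of the product at the primes below 1000, and the function names are all assumptions made for the example. It compares the proportion $N(x)/x$ for pairs of consecutive square-free numbers with the truncated product.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: all primes not exceeding n (n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for q in range(p * p, n + 1, p):
                sieve[q] = False
    return [p for p, is_p in enumerate(sieve) if is_p]


def D(m, ks):
    """Number of residue classes (mod m) represented by k_1, ..., k_s."""
    return len({k % m for k in ks})


def count_N(x, ks, r):
    """N(x): number of n <= x with n + k_1, ..., n + k_s all r-free."""
    limit = x + max(ks)
    r_free = [True] * (limit + 1)
    for p in primes_up_to(int(limit ** (1.0 / r)) + 1):
        step = p ** r
        for q in range(step, limit + 1, step):
            r_free[q] = False
    return sum(1 for n in range(1, x + 1) if all(r_free[n + k] for k in ks))


def tau_truncated(ks, r, prime_bound):
    """Product of (1 - D(p^r) p^(-r)) over the primes p <= prime_bound."""
    tau = 1.0
    for p in primes_up_to(prime_bound):
        tau *= 1.0 - D(p ** r, ks) / p ** r
    return tau


X = 20000
proportion = count_N(X, (1, 2), 2) / X
tau = tau_truncated((1, 2), 2, 1000)
print(round(proportion, 4), round(tau, 4))
```

The two printed values agree closely; the comparison illustrates the leading term only and says nothing about the true size of the error term.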

8. Our next problem is to investigate the asymptotic behaviour of $N(x)$ when (3) is replaced 
by the weaker condition (1). We shall see that the formula (19) continues to be valid provided 
that the error term is suitably modified. 

When (1) is satisfied we shall write 

$$H(x) = \sum_{a > x} \frac{1}{a}.$$

Theorem 4.—If $A$ satisfies (1) and (4), then 

$$N(x) = \tau x + O\{xH(\tfrac{1}{4}\log x)\} + O(x^{1/2}).$$

Let $y$ be a function of $x$, to be chosen later, such that $y \to \infty$ as $x \to \infty$, and take $x$ to 
be sufficiently large throughout the proof. 

Some of the results found below will be used again in later sections when the conditions 
imposed on $A$ have been varied; we shall, therefore, note under what conditions each result 
we deduce is valid. 

First let $A$ satisfy (1). We then have 

$$0 \le N(x, y) - N(x) \le \sum n \le x;\ a \mid (n + k_i)\ [1 \le i \le s;\ a > y] \mid 1$$

$$\le \sum_{a > y} \sum_{1 \le i \le s} \sum n \le x;\ a \mid (n + k_i) \mid 1$$

$$= \sum y < a \le x + k \ \Big|\ \sum_{1 \le i \le s} \Big\{O\Big(\frac{x}{a}\Big) + O(1)\Big\}$$

$$= O(xH(y)) + O(A(x)). \tag{21}$$

Moreover, 

$$A(x) = \sum_{a \le y} 1 + \sum_{y < a \le x} 1 = O(y) + O(xH(y)),$$

and we therefore see by (21) that, if $A$ satisfies (1), then 

$$N(x) = N(x, y) + O(xH(y)) + O(y). \tag{22}$$

Next, let $A$ be subject to (4). For a given $a$ the numbers $n$ such that $a \nmid (n + k_i)$ $(1 \le i \le s)$ 
form precisely $a - D(a)$ residue classes (mod $a$). Hence, by Lemma 1, the numbers $n$ such 
that $a \nmid (n + k_i)$ $(1 \le i \le s;\ a \le y)$ form precisely $\prod_{a \le y} (a - D(a))$ residue classes (mod $\prod_{a \le y} a$). 
Therefore 

$$N(x, y) = x \prod_{a \le y} (1 - D(a)a^{-1}) + O\Big(\prod_{a \le y} a\Big)$$

$$= x \prod_{a \le y} (1 - D(a)a^{-1}) + O(y^{A(y)}). \tag{23}$$

But, since $1 \le D(a) \le s\ (a)$, we have, if $A$ satisfies (1), 

$$\prod_{a > y} (1 - D(a)a^{-1})^{-1} = \exp\Big\{\sum_{a > y} -\log(1 - D(a)a^{-1})\Big\}$$

$$= \exp\{O(H(y))\}$$

$$= 1 + O(H(y)),$$

and therefore, by (22) and (23), we see that if $A$ satisfies (1) and (4), then 

$$N(x) = \tau x + O(xH(y)) + O(y^{A(y)}). \tag{24}$$

But, if $A$ satisfies (4), then 

$$A(y) \le \pi(y) < 2y/\log y. \tag{25}$$

Hence, by (24), if $A$ satisfies (1) and (4), then 

$$N(x) = \tau x + O(xH(y)) + O(e^{2y}).$$

The assertion now follows if we take $y = \tfrac{1}{4}\log x$. 




We conclude this section by mentioning some special cases of the problem just considered. 
If the class $A$ is finite, then the integers $n$ such that $n + k_1, \ldots, n + k_s$ are all $A$-free form 
precisely $\prod_a (a - D(a))$ residue classes (mod $\prod_a a$). Hence $N(x)$ is given by 

$$N(x) = \tau x + O(1).$$
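For a finite class the count of residue classes can be verified exactly. The sketch below is illustrative only: the class {4, 9}, the shifts (1, 2), and the function names are assumptions chosen for the example. It counts the admissible $n$ in one full period and compares the result with the product of the factors $a - D(a)$.

```python
from math import prod


def D(m, ks):
    """Number of residue classes (mod m) represented by k_1, ..., k_s."""
    return len({k % m for k in ks})


def count_N_finite(x, ks, a_numbers):
    """N(x) for a finite class: n <= x with no a-number dividing any n + k_i."""
    return sum(
        1
        for n in range(1, x + 1)
        if all((n + k) % a != 0 for k in ks for a in a_numbers)
    )


def residue_class_count(ks, a_numbers):
    """Product of (a - D(a)) over the class, as in the text."""
    return prod(a - D(a, ks) for a in a_numbers)


A_FINITE = [4, 9]   # illustrative pairwise coprime a-numbers
KS = (1, 2)         # illustrative shifts
period = prod(A_FINITE)

print(count_N_finite(period, KS, A_FINITE), residue_class_count(KS, A_FINITE))
print(count_N_finite(10 * period, KS, A_FINITE))
```

Over whole periods the count is exactly $\tau x$ with $\tau = 14/36$ in this example, which is the source of the bounded error term.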

Again, if $A$ satisfies not only (1) but also the stronger condition (3), then we obtain by 
partial summation from (3) 

$$H(x) = O(x^{-1+\vartheta}).$$

Hence, by Theorem 4, 

$$N(x) = \tau x + O\{x (\log x)^{-1+\vartheta}\}.$$

A slightly sharper result can be deduced from (24). For we then have 

$$N(x) = \tau x + O(xy^{-1+\vartheta}) + O(y^{\alpha y^{\vartheta}}),$$

where $\alpha$ is a number independent of $x$. Putting $y = \beta(\log x/\log\log x)^{1/\vartheta}$, where $\alpha\beta^{\vartheta} < \vartheta/2$, 
we obtain 

$$N(x) = \tau x + O\{x (\log\log x/\log x)^{(1-\vartheta)/\vartheta}\}.$$

However, this formula is still inferior to (19). We therefore recognize that the method of § 5 can 
deal with fewer cases than the method of the present paragraph, but that when it is applicable 
it yields much sharper results. 

9. We already know by Theorem 2 that when $A$ satisfies (3) and (4) $N$ is either empty or 
has a positive density. This result will now be extended to the case when (3) is replaced by 
(1), and we shall, in fact, obtain a criterion for deciding whether $N$ is empty or not. 

Theorem 5.—Let $A$ satisfy (1) and (4). If for every $a$, there exists some $n$ such that 

$$n + k_i \not\equiv 0 \pmod{a} \quad (1 \le i \le s), \tag{26}$$

then $N$ has a positive density. If, on the other hand, (26) cannot be satisfied for some $a$, then 
$N$ is empty. 

It is clear that a necessary condition for $N$ to be non-empty is that (26) should be satisfied 
for every $a$ and some $n$; Theorem 5 shows that this obviously necessary condition is also 
sufficient. 

The proof is now almost trivial. If (26) is satisfied for every $a$ and some $n$, then clearly 
$D(a) < a\ (a)$, and therefore $\tau > 0$. Hence, by Theorem 4, $N$ has a positive density. On the 
other hand, if (26) cannot be satisfied for some $a$, then $N$ is obviously empty. 
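The criterion of Theorem 5 can be illustrated with square-free numbers, taking $A$ to be the squares of the primes. The sketch below is illustrative only: the two sets of shifts, the search bounds, and the function names are assumptions made for the example. With shifts 1, 2, 3, 4 every residue class (mod 4) is represented, so (26) fails for $a = 4$; with shifts 1, 2, 3 no $a$-number has all its residue classes represented.

```python
def is_squarefree(n):
    """True if no square greater than 1 divides n."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True


def D(m, ks):
    """Number of residue classes (mod m) represented by k_1, ..., k_s."""
    return len({k % m for k in ks})


def members_of_N(ks, x):
    """All n <= x such that n + k_1, ..., n + k_s are all square-free."""
    return [n for n in range(1, x + 1) if all(is_squarefree(n + k) for k in ks)]


blocked = (1, 2, 3, 4)   # D(4) = 4, so condition (26) fails for a = 4
open_ks = (1, 2, 3)      # D(p^2) = 3 < p^2 for every prime p

print(D(4, blocked), members_of_N(blocked, 2000))
print(D(4, open_ks), members_of_N(open_ks, 50)[:3])
```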

10. If condition (1) is replaced by (2), i.e. if it is assumed that $\sum_a a^{-1}$ diverges, the problem 
of estimating $N(x)$ naturally becomes much more difficult, and we are at present only able 
to obtain an upper estimate of $N(x)$. 

We shall write $V(x) = \sum_{a \le x} \frac{1}{a}$. 

Theorem 6.—If $A$ satisfies (2) and (4), then 

$$N(x) = O\{x \exp(-sV(\log x))\}.$$

Denoting by $y$ a function of $x$ to be chosen later, we have, by (23) and (25), 

$$N(x) \le N(x, y) = x \prod_{a \le y} (1 - D(a)a^{-1}) + O(e^{2y}).$$

But $D(a) = s$ for $a > k$, and so 

$$N(x) = O\{x \exp(-sV(y))\} + O(e^{2y}).$$

Furthermore, 

$$V(4y) = \sum_{a \le y} \frac{1}{a} + \sum_{y < a \le 4y} \frac{1}{a} \le V(y) + \sum_{y < n \le 4y} \frac{1}{n} = V(y) + O(1).$$

Therefore 

$$N(x) = O\{x \exp(-sV(4y))\} + O(e^{2y}),$$

and putting $y = (\log x)/4$ we obtain 

$$N(x) = O\{x \exp(-sV(\log x))\} + O(x^{1/2}).$$

But 

$$V(x) \le \sum_{n \le x} \frac{1}{n} < 2\log x,$$

and so, 

$$x \exp(-sV(\log x)) > x \exp(-2s\log\log x) = x (\log x)^{-2s} > x^{1/2}.$$

This completes the proof. 

We observe the following consequences of Theorem 6, valid for a set A satisfying (4):— 
If A(x) > ax/log x for x > j 3 , then N(x) — 0 {x (log log x)~ sa }. (27) 

If A(x) ~ ax/log x , then N(oc) — 0 {x (log log #)- sa+e }. (28) 

If A(x)— ax/log x + 0 (x/ log 2 x), then N(x) = 0 {x (log log x)~ sa }. (29) 

To prove these results we note that if A(x) > ax/log x for x > j8, then 
rr/ x ^A(n)-A(n-i) „ A(n) A(x) 

() "i§,'-»- b0) 


> 2 / . ?, -= alog log x + 0 (i). 

Hence (27) follows by Theorem 6, and (28) is an immediate consequence of (27). Again, if
$A(x) = \alpha x/\log x + O(x/\log^2 x)$, then, by (30), $V(x) = \alpha \log\log x + O(1)$, and (29) follows by
Theorem 6.

As a final application of Theorem 6 we note that if A is the set of all primes congruent to
l (mod q), where (l, q) = 1, then

$$N(x) = O\{x(\log\log x)^{-s/\phi(q)}\}.$$

This follows in view of the well-known relation

$$\sum_{\substack{p \le x \\ p \equiv l \ (\mathrm{mod}\ q)}} \frac{1}{p} = \frac{1}{\phi(q)} \log\log x + O(1).$$
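The relation between the reciprocal sum over primes in a residue class and log log x can be checked numerically. The following is only an illustrative sketch: the function names, the choice q = 4 and the cut-offs are assumptions made for the example and are not taken from the paper. Differences of the sum between two values of x are compared with the predicted growth, so that the unknown O(1) constant cancels.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: all primes <= n."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            # Strike out the multiples of i, starting at i*i.
            sieve[i * i :: i] = bytearray(len(range(i * i, n + 1, i)))
    return [i for i in range(2, n + 1) if sieve[i]]

def euler_phi(q):
    """Euler's function: the number of residues mod q coprime to q."""
    return sum(1 for k in range(1, q + 1) if math.gcd(k, q) == 1)

def reciprocal_sum(x, l, q):
    """Sum of 1/p over primes p <= x with p congruent to l (mod q)."""
    return sum(1.0 / p for p in primes_up_to(x) if p % q == l % q)

def predicted_growth(x_low, x_high, q):
    """Growth of (1/phi(q)) log log x between x_low and x_high."""
    return (math.log(math.log(x_high)) - math.log(math.log(x_low))) / euler_phi(q)

if __name__ == "__main__":
    for l in (1, 3):
        observed = reciprocal_sum(10**5, l, 4) - reciprocal_sum(10**3, l, 4)
        print(l, observed, predicted_growth(10**3, 10**5, 4))
```

Both residue classes mod 4 should show nearly the same growth, in line with the factor 1/φ(q).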

11. So far we have always assumed that any two a-numbers are coprime. In this section
we shall drop this restriction but shall suppose that $\sum_a a^{-1}$ converges. What can then be said
about N? This question is a generalization of the question considered in §§ 8-9, but since
our present restrictions on A are very much less severe only a weaker result can be expected.
We shall, in fact, find an asymptotic formula for N(x), but we shall not be able to obtain any
estimation of the error term in this formula.

The corresponding problem when condition (1) is also dispensed with is even more difficult.
Besicovitch (1934) has shown that in that case N need not possess a density. In the opposite
direction it was proved by Davenport and Erdős (1936) that for s = 1 N must have a logarithmic
density. It would be interesting to extend their result to any values of s.

Theorem 7.—If A satisfies (1), then N is either empty or has a positive density.

Let y be a function of x, to be chosen later, such that y → ∞ as x → ∞. We have,
by (21) and (1),

$$N(x) = N(x, y) + o(x). \tag{31}$$

Let r = r(x) be the number of a-numbers not exceeding y, and let these numbers be denoted
by $a_1, \ldots, a_r$. Furthermore, let $\tau_r$ be the density of the set of positive integers n such that

$$a_i \nmid (n + l_j) \qquad (1 \le i \le r;\ 1 \le j \le s).$$

By Lemma 1 this set consists of certain residue classes (mod $\{a_1, \ldots, a_r\}$), and therefore

$$N(x, y) = \tau_r x + O(a_1 \cdots a_r). \tag{32}$$

Choosing y such that $a_1 \cdots a_r = o(x)$ we have, by (31) and (32),

$$N(x) = \tau_r x + o(x). \tag{33}$$

Now clearly $\tau_r \ge \tau_{r+1} \ge 0$ (r ≥ 1). Therefore $\lim_{r\to\infty} \tau_r$ exists. Denoting it by $\tau^*$, we have
by (33), since r → ∞ as x → ∞,

$$N(x) \sim \tau^* x. \tag{34}$$

Finally, we make use of (13) in the case when each C* consists of a single number 1. Hence,
if A satisfies (1) and N is not empty, then N(x) > αx for some α > 0 (independent of x) and
all sufficiently large values of x. It therefore follows that either N is empty or (34) holds
with some $\tau^* > 0$.

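It is worth observing that the inequality $\tau^* > 0$ can be proved by a very simple and direct
argument for s = 1. In that case $\tau_r$ is the density of the set of positive integers n such that
$a_i \nmid (n + l_1)$ for 1 ≤ i ≤ r, and therefore, by an inequality proved independently by
Heilbronn (1937) and Rohrbach (1937), we have

$$\tau_r \ge \prod_{1 \le i \le r} \Big(1 - \frac{1}{a_i}\Big).$$

Hence

$$\tau^* = \lim_{r\to\infty} \tau_r \ge \prod_a \Big(1 - \frac{1}{a}\Big) > 0.$$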
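The inequality of Heilbronn and Rohrbach used above can be illustrated on a small example. The sketch below is not part of the paper: the function names and the sample sets are assumptions chosen for illustration. It computes, by counting over one full period, the exact density of the integers divisible by none of a finite list of numbers and compares it with the product of the factors (1 - 1/a).

```python
from functools import reduce
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

def density_not_divisible(a_numbers):
    """Exact density of positive integers divisible by none of a_numbers.

    The set is periodic with period lcm(a_numbers), so one period suffices.
    """
    period = reduce(lcm, a_numbers, 1)
    survivors = sum(
        1 for n in range(1, period + 1) if all(n % a != 0 for a in a_numbers)
    )
    return survivors / period

def product_bound(a_numbers):
    """Lower bound prod (1 - 1/a) from the Heilbronn-Rohrbach inequality."""
    bound = 1.0
    for a in a_numbers:
        bound *= 1.0 - 1.0 / a
    return bound

if __name__ == "__main__":
    for sample in ([2, 3, 5], [4, 6, 9], [6, 10, 15]):
        print(sample, density_not_divisible(sample), product_bound(sample))
```

For pairwise coprime numbers the two quantities agree; when the numbers share factors, as in the second and third samples, the density is strictly larger than the product.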


REFERENCES TO LITERATURE

Besicovitch, A. S., 1934. “On the density of certain sequences of integers”, Math. Ann., CX, 336-341.

Davenport, H., and Erdős, P., 1936. “On sequences of positive integers”, Acta Arithmetica, II.

Heilbronn, H., 1937. “On an inequality in the elementary theory of numbers”, Proc. Camb. Phil. Soc., XXXIII, 207-209.

Mirsky, L., 1947. “Note on an asymptotic formula connected with r-free integers”, Quart. Journ. Math. (Oxford), XVIII, 178-182.

Mirsky, L., 1948. “The additive properties of integers of a certain class”, Duke Math. Journ., XV, 513-533.

Pillai, S. S., 1936. “On sets of square-free integers”, Journ. Indian Math. Soc., II (N.S.), 116-118.

Rohrbach, H., 1937. “Beweis einer zahlentheoretischen Ungleichung”, Journ. für Math., CLXXVII, 193-196.

Scholz, A., 1939. Einführung in die Zahlentheorie, Berlin.


(Issued separately May 24, 1949.)





L.— Quantum Theory of Rest-Masses.* By M. Born, F.R.S., and H. S. Green. 
With Appendices by K. C. Cheng and A. E. Rodriguez, Edinburgh University. 
(With Two Text-figures.) 

(MS. received December 7, 1948. Read February 7, 1949) 


1. Introduction 


It has been acknowledged for a long time that current quantum theory is incomplete. The 
difficulties and unanswered problems which have gradually become apparent during the 
development of the theory will not be discussed here, and only one aspect of the situation will 
be mentioned, that there seems to exist a large number of particles with different rest-masses, 
the numerical values of which demand a theoretical explanation. The experimental material 
has recently been greatly increased by the discovery of several kinds of mesons with different 
rest-masses. 

It is well known that the mass μ of a particle can be replaced by an equivalent length
which is given by the formula $l = \hbar/\mu c$, where ħ is Planck’s constant and c the velocity of light.
Instead of speaking of different rest-masses, one can therefore say that each particle has a
characteristic length which may be a numerical multiple of a certain absolute length a. The
concept of such an absolute length has been suggested and developed by Fürth (1929), Born
(1934, 1938a), Heisenberg (1938, 1943), March (1938, 1947) and Snyder (1947) among
others. However, the introduction of the absolute length has led to difficulties in connection
with relativistic invariance, which is essential for any theory to be applied to fast particles.
Some new principle is clearly required, which, to conform with the tendencies of modern
physics, must be expressed by postulating the invariance of the fundamental laws under
some kind of transformation.
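The correspondence between a rest-mass and its characteristic length may be made concrete with a short numerical sketch. The constants below are modern CODATA values in SI units and are an assumption of this example, as are the function names; the paper itself works in Gaussian units, where the classical electron radius is e²/mc², which equals the fine-structure constant times ħ/mc.

```python
# Modern CODATA values (SI units); these are assumptions of the sketch,
# not figures quoted in the paper.
HBAR = 1.054571817e-34      # reduced Planck constant, J s
C = 299792458.0             # velocity of light, m/s
M_ELECTRON = 9.1093837015e-31  # electron rest-mass, kg
FINE_STRUCTURE = 7.2973525693e-3  # e^2 / (hbar c) in Gaussian units

def characteristic_length(mass_kg):
    """Length l = hbar / (mu c) equivalent to a rest-mass mu."""
    return HBAR / (mass_kg * C)

def classical_electron_radius():
    """e^2 / (m c^2), written as alpha * hbar / (m c)."""
    return FINE_STRUCTURE * characteristic_length(M_ELECTRON)

if __name__ == "__main__":
    l_e = characteristic_length(M_ELECTRON)
    a = classical_electron_radius()
    print("hbar/(m c)    =", l_e, "m")
    print("e^2/(m c^2)   =", a, "m")
    print("ratio l_e / a =", l_e / a)  # about 137
```

The ratio of the two lengths is the reciprocal of the fine-structure constant, which is the origin of the factor 137 appearing in the mass formulae later in the paper.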

Invariance under one such transformation is already apparent in the accepted laws of
physics. The fundamental classical laws were expressed by the formulae

$$\dot{x}_k = \frac{\partial H}{\partial p_k}, \qquad \dot{p}_k = -\frac{\partial H}{\partial x_k}, \tag{1.1}$$

giving the time variation of the co-ordinates $x_k$ and associated momenta $p_k$ in terms of
derivatives of the Hamiltonian function H. These laws remain unchanged if $x_k$ is replaced by
$p_k$ and $p_k$ by $-x_k$. The same symmetry appears in the fundamental commutation laws

$$x^k p_l - p_l x^k = i\hbar\,\delta^k_l, \qquad \delta^k_l = \begin{cases} 0 & (k \ne l), \\ 1 & (k = l), \end{cases} \tag{1.2}$$

of relativistic quantum mechanics, in the Fourier transformation connecting wave functions
and representatives of dynamical variables in the co-ordinate and momentum representations,
and also in the formal expression

$$m_{kl} = x_k p_l - x_l p_k \tag{1.3}$$

for the important angular momentum tensor. If one introduces the fundamental length a
and the corresponding momentum b, then Planck’s constant is expressed as the symmetric
product ħ = ab. When a and b are used as units of distance and momentum, the equations
(1.2) and (1.3) can be written in a simple dimensionless form.

* This paper was assisted in publication by a grant from the Carnegie Trust for the Universities of 
Scotland. 




All these considerations strongly suggest that it should be possible to represent every
fundamental physical law in this dimensionless symmetric way, but this has never been
universally accepted as a general principle. It was, in fact, suggested some time ago by one
of the authors (Born, 1938) that the symmetry between co-ordinates and momenta has a deeper
significance than generally appreciated, but attempts to obtain new results from this “Principle
of Reciprocity” (Landé, 1939; Born, 1939; Born and Fuchs, 1940, 1941; Fuchs, 1940,
1941; Sarginson, 1941) led to nothing of practical importance.

It has indeed become clear that a change of attitude towards quantum theory is required. 
The current quantum theory proceeds from a set of field equations for each kind of particle, 
which can be derived from the Hamiltonian H or, better, by a variational principle from the 
Lagrangian L for the corresponding field. The form of the Lagrangian is determined partly 
by application of the correspondence principle from classical mechanics, and partly by making 
use of quantum-mechanical considerations, such as spin, derived from observation. For 
every kind of particle one has therefore an essentially empirical Lagrangian function L, the
structure of which indicates the spin, while the only numerical constant appearing is the
rest-mass. The problem to be solved has hitherto been that of finding the wave function ψ,
which determines the physical characteristics of the particle. As soon as the rest-masses 
themselves become the object of interest, however, this standpoint is inadequate, and the 
new problem consists in the finding of the function L itself from a general principle. The 
object of this paper is to show that the principle of reciprocity is a powerful tool for restricting 
the choice of L and producing those Lagrangians which correspond to observed particles. 

This being accepted, the whole of quantum mechanics separates into two distinct fields: 
the first the determination of the different possible Lagrangians and the corresponding rest- 
masses for free particles (and the coupling between them, which will be given only slight 
consideration here); the second the derivation of the wave functions from the Lagrangians 
for a given experimental arrangement. 

Before the first step could be taken a considerable amount of work has had to be done, of 
which the beginnings appear in another paper (Green, 1949), but which will be developed 
here in full. The Lagrangians which are obtained by the principle of reciprocity have a very 
general form, containing all derivatives of the wave function with respect to space and time. 
Such Lagrangians have been considered by Chang (1946, 1948) and de Wet (1948), whose 
calculations are, however, so complicated that it is impossible to apply them in practice. A 
definite meaning has to be given to the particle density, energy density, and the corresponding 
flux densities, and second quantisation applied so as to derive the particle aspect of the wave 
field. 

In the first section following it is shown how the Lagrangians are to be derived from the 
Principle of Reciprocity, and this is in turn followed by the general field theory and second 
quantisation procedure which are independent of the actual Lagrangians used. The method 
is then applied in detail to the determination of the masses of particles with spin zero or one, 
and it is shown that among the most stable particles occur some which have masses in close 
agreement with experimental determinations of the meson masses. There exist also particles 
of spin one and zero rest-mass which may be interpreted as photons. Finally, in the first 
appendix, particles with spin half are considered, and it is found that they include mesons, 
as well as particles with vanishing rest-mass which may be interpreted as electrons or neutrinos, 
according as they are charged or not. The view is taken that the mass of the electron is wholly
electromagnetic in origin, in conformity with the most recent expectations of quantum
electrodynamics.

2. Self-Reciprocal Wave Operators 

Throughout this paper the notation of the theory of relativity is used, so that a relativistic
affix k or l assumes the values 1 to 4, and is summed from 1 to 4 in any expression in which
it occurs repeated. Also, for any four-vector $z_k = (z_1, z_2, z_3, z_4)$,

$$z^l = g^{lk} z_k, \qquad z_k = g_{kl} z^l,$$

$$g^{lk} = g_{lk} = \begin{cases} 0 & (k \ne l), \\ 1 & (k = l = 4), \\ -1 & (k = l = 1, 2, 3), \end{cases} \tag{2.1}$$

so that $g_{kl}$ is the metric tensor of Galilean space-time. It will usually be convenient to confine
attention to a finite volume Ω of real space, so that each component of the momentum of a
particle has the discrete proper-values $2\pi n\hbar\,\Omega^{-1/3}$, where n is an integer, but the extension to
infinite space when Ω → ∞ is trivial.

Commencing with the consideration of particles, like the photon and certain kinds of
meson, which have integral spin, the generally accepted wave equation for these particles has
the form

$$\Big(\frac{E^2}{c^2} - \mathbf{p}^2\Big)\psi = \mu^2 c^2 \psi, \qquad \psi = \psi(\mathbf{x}, ct), \tag{2.2}$$

where ψ is the wave function, which may be a scalar or a vector, and μ is the appropriate
rest-mass. By expressing $x_k$ in units of a fundamental length a, which may be taken to be
the classical radius of the electron, and $p_k$ in units of the corresponding fundamental momentum
b = ħ/a, the equations (2.2) are reduced to the dimensionless form

$$p_k p^k \psi(x_l) = \kappa^2 \psi(x_l), \tag{2.3}$$

where

$$\kappa = \frac{\mu c}{b} = \frac{\mu c a}{\hbar} \tag{2.4}$$

in terms of the customary units. The most general solution of (2.3) is known to be

$$\psi(x_l) = \sum_{p} \phi(p_l) \exp(-i p_k x^k), \tag{2.5}$$

where the $p_l$ are constants connected by the relation $p_l p^l = \kappa^2$ and the $\phi(p_l)$ are also (scalar
or vector) constants. The same wave function (2.5), however, also satisfies the equation

$$F(p_l p^l)\,\psi(x_k) = 0, \tag{2.6}$$

where $F(p_l p^l)$ is any function of the form

$$F(p_l p^l) = F_1(p_l p^l)(p_k p^k - \kappa^2), \tag{2.7}$$

and if $F_1(z) = 0$ has no root, (2.5) is the only solution. If, however, $F_1(p_l p^l)$ is itself of the form
$F_2(p_l p^l)(p_l p^l - \kappa_1^2)$, then the equation (2.6) will have solutions corresponding to particles of
rest-mass $\mu_1 = \kappa_1 b/c$ also; and it can be seen quite generally that a wave equation of the type
(2.6) may have solutions corresponding to particles with any number of different rest-masses.
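The way in which a factorised wave operator admits several rest-masses can be seen in a toy calculation. In the sketch below F is taken, purely as an assumption for illustration, to be a product of factors (z − κ²), with z standing for the invariant $p_l p^l$ of a plane wave; the function names and sample values are likewise hypothetical. A plane wave is annihilated by F exactly when its invariant is a root of F.

```python
import math

def wave_operator(kappa_squares):
    """Return F(z) = prod (z - kappa_i^2), a polynomial wave operator."""
    def F(z):
        value = 1.0
        for k2 in kappa_squares:
            value *= z - k2
        return value
    return F

def is_solution(F, invariant, tol=1e-12):
    """A plane wave with p_l p^l = invariant satisfies F psi = 0 iff F vanishes there."""
    return abs(F(invariant)) < tol

def rest_masses(kappa_squares):
    """Dimensionless rest-masses kappa (units of b/c) admitted by the operator."""
    return sorted(math.sqrt(k2) for k2 in kappa_squares if k2 >= 0)

if __name__ == "__main__":
    F = wave_operator([2.0, 4.0])
    print(is_solution(F, 2.0), is_solution(F, 4.0), is_solution(F, 3.0))
    print(rest_masses([2.0, 4.0]))
```

Each additional factor of the operator contributes one further admissible rest-mass, which is the observation the text goes on to exploit.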
The wave equation

$$\alpha^k p_k \psi(x_l) = \kappa\,\psi(x_l), \qquad \alpha^k\alpha^l + \alpha^l\alpha^k = 2g^{kl}, \tag{2.8}$$

of the electron, or any other particle with spin half, can in a similar way be shown to have
solutions satisfying the equation

$$F(\alpha^k p_k)\,\psi(x_l) = 0, \tag{2.9}$$

where $F(\alpha^k p_k)$ is any function with a factor $\alpha^k p_k - \kappa$; and, though this is not necessary, (2.9)
may have solutions corresponding to particles with different rest-masses from those originally
considered. In the same way, the equation

$$F(p_k)\,\psi(x_l) = 0 \tag{2.10}$$

for any spin may characterise particles either all with the same rest-mass or with many
different rest-masses.

It is obvious that the choice of the function $F(p_k)$ is a priori very arbitrary. For any given
spin, the requirement that it should be relativistically invariant imposes a considerable limitation
on the form of F, but some further restriction is still required in order that it should
provide a description of the particles which are actually observed. Such a restriction is
provided by the Principle of Reciprocity, which imposes on F the condition that it should be
reciprocally invariant as well as relativistically invariant. A scalar reciprocal invariant is
defined as a function which satisfies the equation

$$S(x_k, p_k)F(x_l) = sF(x_l), \tag{2.11}$$

where

$$S(x_k, p_k) = S(p_k, -x_k). \tag{2.12}$$

From this symmetrical property $S\big(x_k, i\frac{\partial}{\partial x^k}\big) = S\big(i\frac{\partial}{\partial x^k}, -x_k\big)$ of what will be called the metric
operator S, it follows that the equation (2.11) will still be satisfied when $F(x_l)$ is replaced by
$F(p_l)$.

To demonstrate the connection of this definition of reciprocal invariance with Born’s
original definition (1938) in terms of the Fourier transformation, it is sufficient to consider the
equation

$$F(x) = (2\pi)^{-1/2}\int F(p)e^{-ipx}\,dp \tag{2.13}$$

defining a reciprocally invariant function of one variable x. If $F_m(x)$ is any solution, one has

$$pF_m(x) = i\frac{d}{dx}F_m(x) = (2\pi)^{-1/2}\int \{pF_m(p)\}e^{-ipx}\,dp,$$

$$(2\pi)^{-1/2}\int \Big\{i\frac{d}{dp}F_m(p)\Big\}e^{-ipx}\,dp = -(2\pi)^{-1/2}\int F_m(p)\Big\{i\frac{d}{dp}e^{-ipx}\Big\}dp = -xF_m(x), \tag{2.14}$$

by integration by parts. Hence

$$S(x, p)F_m(x) = (2\pi)^{-1/2}\int \Big\{S\Big(-i\frac{d}{dp}, p\Big)F_m(p)\Big\}e^{-ipx}\,dp = (2\pi)^{-1/2}\int \Big\{S\Big(p, i\frac{d}{dp}\Big)F_m(p)\Big\}e^{-ipx}\,dp,$$

on the assumption that S satisfies (2.12), i.e.

$$S\Big(x, i\frac{d}{dx}\Big)F_m(x) = (2\pi)^{-1/2}\int \Big\{S\Big(p, i\frac{d}{dp}\Big)F_m(p)\Big\}e^{-ipx}\,dp. \tag{2.15}$$

From this it is evident that $S\big(x, i\frac{d}{dx}\big)F_m(x)$ is a linear combination of the solutions $F_n(x)$ of
(2.13),

$$S(x, p)F_m = \sum_n S_{mn}F_n,$$

or, making the matrix $S_{mn}$ diagonal,

$$S(x, p)F = sF. \tag{2.16}$$

The theorem just proved shows that for the determination of F it is sufficient to proceed
from the simplest relativistically invariant function satisfying (2.12), which is

$$S = x_k x^k + p_k p^k. \tag{2.17}$$

This may be used to determine the scalar, vector and tensor reciprocal invariants. As shown
by Dr K. C. Cheng in Appendix I to this paper, for the determination of spinor and other
more complex reciprocal invariants, the appropriate metric operator S may be defined to
satisfy

$$S(x_k, p_k) = T^{-1}S(p_k, -x_k)T, \tag{2.18}$$

where T is a unitary operator not involving $x_k$ or $p_k$. Multiplying (2.11) by T before and $T^{-1}$
afterwards, one then obtains

$$S(p_k, -x_k)\,TF(x_l)T^{-1} = s\,TF(x_l)T^{-1}, \tag{2.19}$$

or, replacing $x_k$ by $-p_k$ throughout,

$$S(x_k, p_k)\,TF(p_l)T^{-1} = s\,TF(p_l)T^{-1}. \tag{2.20}$$

The reciprocal of $F(x_k)$ is then $TF(p_l)T^{-1}$.
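A one-variable analogue of the eigenvalue problem for the metric operator can be checked numerically. In the sketch below the operator is taken to be x² + p² with p = i d/dx, so that p² = −d²/dx²; this one-dimensional reduction, the finite-difference step and the function names are assumptions made for illustration and do not reproduce the four-dimensional calculation of the paper. The Gaussian exp(−x²/2), which is its own Fourier transform, should then appear as an eigenfunction with s = 1, and x exp(−x²/2) as one with s = 3.

```python
import math

def gaussian(x):
    """exp(-x^2/2): unchanged by the Fourier transformation (2.13)."""
    return math.exp(-x * x / 2.0)

def first_excited(x):
    """x exp(-x^2/2): the next eigenfunction of x^2 - d^2/dx^2."""
    return x * math.exp(-x * x / 2.0)

def apply_metric_operator(F, x, h=1e-3):
    """Apply S = x^2 + p^2 = x^2 - d^2/dx^2 using a central difference."""
    second_derivative = (F(x + h) - 2.0 * F(x) + F(x - h)) / (h * h)
    return x * x * F(x) - second_derivative

def eigenvalue_estimate(F, x, h=1e-3):
    """Ratio (S F)(x) / F(x); constant in x when F is an eigenfunction."""
    return apply_metric_operator(F, x, h) / F(x)

if __name__ == "__main__":
    for x in (0.3, 0.7, 1.5):
        print(x, eigenvalue_estimate(gaussian, x), eigenvalue_estimate(first_excited, x))
```

The estimates are independent of the point x at which they are evaluated, up to the discretisation error of the central difference.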

The reciprocally invariant wave operators F will be obtained by the detailed solution of
the equation (2.11) in the later sections of this paper. Before this is done, it is necessary to
inquire more closely into the observational properties of systems described by wave equations
of the type (2.10). In particular, proof must be given that the modification of the original
wave equation (2.2) has not invalidated the interpretation of the wave field in terms of an
assembly of particles with different rest-masses given by the factors of the wave operator in
the way suggested by the early considerations of this section. Now the properties of wave
fields are most conveniently derived from the Lagrangian function, and it is important, therefore,
to be able to construct this function from a knowledge of the form of $F(p_k)$. For linear
fields of the type here considered, the Lagrangian is a linear combination of the statistical
operator

$$\rho(x_k, x_k') = \psi(x_k)\psi^*(x_k'), \tag{2.21}$$

and its derivatives of all orders with respect to $x_k$ and $x_k'$. Writing for brevity
$p^m = p_1^{m_1}p_2^{m_2}p_3^{m_3}p_4^{m_4}$, etc., the Lagrangian may therefore be represented in the form

$$L(x_k, x_k') = \sum_{mn} c_{mn}\,p^m p'^n \rho(x_k, x_k') = F(p_l, p_l')\rho(x_k, x_k'). \tag{2.22}$$

The field equations result from the variational principle

$$\delta \iint L(x_k, x_k)\,d\Omega\,dt = 0,$$

and by the well-known procedure one obtains easily from (2.22),

$$\sum_{mn} c_{mn}\,p^{m+n}\psi(x_k) = F(p_l, p_l)\psi(x_k) = 0. \tag{2.23}$$

It is therefore clear that the function $F(p_k)$ of (2.10) must be identical with $F(p_k, p_k)$, where
$F(p_k, p_k')$ is defined by (2.22).

The construction of the Lagrangian function from the wave operator $F(p_k)$ is now obviously
not a unique procedure; however, it will appear in the next section that the observable
properties of the field do not depend on the way in which it is done. For $F = F(p_k p^k)$ and
$F = F(\alpha^k p_k)$, the most convenient choice is to take $L = F(p_k' p^k)\rho$ and $L = F\{\tfrac12\alpha^k(p_k + p_k')\}\rho$
respectively.


3. The Field Theory 

In order to ascertain the observational properties of fields described by equations of the type
$F(p_k)\rho = 0$, which result from multiplying (2.23) by $\psi^*(x_k')$, it is necessary to investigate the
field theory and second quantisation of fields whose Lagrangian operators have the general
form

$$L(x_k, x_k') = F(p_k, p_k')\rho(x_k, x_k') \tag{3.1}$$

suggested by (2.22). If the operation of setting $x_k' = x_k$ is denoted by the brackets ⟨ ⟩, then
⟨L⟩ is the ordinary Lagrangian density of the field. Although the statistical operator has
been defined by (2.21), there are grounds, suggested by the work of Dirac (1942), for supposing
that ρ in general is not positive definite, so that $\rho(x_k, x_k')$ is henceforth regarded as an arbitrary
function of $x_k$ and $x_k'$, subject only to the Hermitian property $\rho^*(x_k, x_k') = \rho(x_k', x_k)$.

The method of Heisenberg and Pauli (1929) fails to quantise fields for which F is other
than a linear function of $p_k$ and $p_k'$, and that of Chang (1946, 1948) or de Wet (1948), though
applicable in principle to any polynomial form of F, is very cumbersome if at all applicable
to the transcendental functions at present under consideration. The method here adopted
is a development of the procedure proposed by one of the authors (Green, 1949) in another
paper.

If $G^k$ and $G^{k'}$ are defined by

$$F(p_k, p_k) - F(p_k, p_k') = \pi_k G^k(p_l, p_l'), \qquad \pi_k = p_k - p_k',$$

$$F(p_k', p_k') - F(p_k, p_k') = \pi_k' G^{k'}(p_l, p_l'), \qquad \pi_k' = p_k' - p_k, \tag{3.2}$$

then the field equations $F(p_k, p_k)\rho = 0$, $F(p_k', p_k')\rho = 0$, become

$$(F + \pi_k G^k)\rho = 0, \qquad (F + \pi_k' G^{k'})\rho = 0. \tag{3.3}$$

Subtracting the second of the equations (3.3) from the first, one obtains, since $\pi_k' + \pi_k = 0$, the
equation of continuity

$$\pi_k(G^k + G^{k'})\rho = 0 \tag{3.4}$$

in operational form, or $\frac{\partial}{\partial x^k}\langle (G^k + G^{k'})\rho \rangle = 0$ in the more customary notation, by setting
$x_k' = x_k$. It will be observed that if $A(x_k, x_k')$ is any operator, then

$$\Big\langle \Big(\frac{\partial}{\partial x^k} + \frac{\partial}{\partial x^{k'}}\Big)A(x_l, x_l') \Big\rangle = \frac{\partial}{\partial x^k}\langle A(x_l, x_l') \rangle.$$

According to (3.4), the four-vector

$$R^k = \tfrac12(G^k + G^{k'})\rho \tag{3.5}$$

is to be interpreted as the density-density flux vector. If the flux across a surface at infinity
vanishes, it follows also from (3.4) that

$$\frac{dN}{dt} = 0, \qquad N = \int \langle R^4 \rangle\,d\Omega, \tag{3.6}$$

where the integration extends over all real space Ω. This shows that the total amount of
matter N is conserved.

Next, by multiplying the first of the equations (3.3) by $p_l$ and the second by $p_l'$ before
subtraction, one has

$$\pi_k(-F\delta^k_l + G^k p_l + G^{k'} p_l')\rho = 0, \tag{3.7}$$

which may be interpreted as the equation of conservation of energy in operational form, the
canonical energy-energy flux tensor being defined by

$$L^k_l = \tfrac12(-F\delta^k_l + G^k p_l + G^{k'} p_l')\rho. \tag{3.8}$$

The Hamiltonian energy-momentum vector is defined as

$$P_k = \int \langle L^4_k \rangle\,d\Omega. \tag{3.9}$$

It may be noticed that the momentum density $L^4_k$ so defined does not coincide with the energy
flux $g_{kl}L^l_4$ unless $L_{kl}$ is symmetrical; it will be found, however, that this condition is satisfied
in the application to meson fields envisaged. It will now be shown that

$$P_k = \int \langle \tfrac12(p_k + p_k')R^4 \rangle\,d\Omega. \tag{3.10}$$

For k = 4 this follows from the fact, seen from the field equations (3.3), that

$$F\rho = -\tfrac12\pi_k(G^k - G^{k'})\rho, \tag{3.11}$$

so $\langle L^4_4 \rangle$ differs from $\langle \tfrac12(p_4 + p_4')R^4 \rangle$ only by a term

$$-\tfrac14\sum_{k=1}^{3}\langle \pi_k(G^k - G^{k'})\rho \rangle,$$

which vanishes on integration over Ω. For k ≠ 4, $\langle L^4_k \rangle$ differs from $\langle \tfrac12(p_k + p_k')R^4 \rangle$ by a
term $\tfrac14\langle \pi_k(G^4 - G^{4'})\rho \rangle$, which equally vanishes on integration over Ω.



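To solve the field equations (3.3), one may substitute for ρ the form

$$\rho_{ij}(\mathbf{p}, \mathbf{p}')\exp\{i(\mathbf{p}\cdot\mathbf{x} - E_i t - \mathbf{p}'\cdot\mathbf{x}' + E_j' t')\}, \tag{3.12}$$

where $\rho_{ij}(\mathbf{p}, \mathbf{p}')$, p, p′, $E_i$, $E_j'$ are constants; then $E_i$ and $E_j'$ are solutions of the equations

$$F(p_k^{(i)}, p_k^{(i)}) = 0, \quad F(p_k'^{(j)}, p_k'^{(j)}) = 0, \qquad p_k^{(i)} = (\mathbf{p}, E_i), \quad p_k'^{(j)} = (\mathbf{p}', E_j'). \tag{3.13}$$

There will in general be several roots $E_i$ and $E_j'$ of these equations, and the most general
solution of the field equations is then obtained by summing over all possible values of i, j, p
and p′. If this solution is substituted in (3.5) and (3.6), one obtains

$$N = \sum_i\sum_{\mathbf{p}} n_i(\mathbf{p}), \qquad n_i(\mathbf{p}) = \Gamma_i(\mathbf{p})\rho_{ii}(\mathbf{p}, \mathbf{p}), \qquad \Gamma_i(\mathbf{p}) = \tfrac12\{G^4(p_k^{(i)}, p_k^{(i)}) + G^{4'}(p_k^{(i)}, p_k^{(i)})\}. \tag{3.14}$$

Terms involving $\rho_{ij}(\mathbf{p}, \mathbf{p}')$ with different values of p and p′ obviously vanish on integration
over Ω, and those corresponding to different values of i and j must also disappear because
according to (3.6) N does not depend on time. From (3.10) one obtains similarly

$$P_k = \sum_i\sum_{\mathbf{p}} n_i(\mathbf{p})\,p_k^{(i)}. \tag{3.15}$$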

i p 

Turning now to the special example of the meson wave field, where F may be taken to be
a function $F(p_k' p^k)$ of $p_k' p^k$ only, one sees from (3.13) that $E_i = (\kappa_i^2 + \mathbf{p}^2)^{1/2}$, $E_j' = (\kappa_j^2 + \mathbf{p}'^2)^{1/2}$,
where the $\kappa_i^2$ are the roots of the equation $F(\kappa^2) = 0$. To calculate $G^k$, define a function
G(z, z′) by means of

$$G(z, z') = \frac{F(z) - F(z')}{z - z'} = G(z', z), \qquad G(z, z) = F'(z) = \frac{dF(z)}{dz}; \tag{3.16}$$

then it follows from (3.2) that

$$\pi_k' G^{k'} = (p_k' p^{k'} - p_k' p^k)\,G(p_k' p^{k'}, p_k' p^k), \qquad G^{k'} = p^{k'} G(p_k' p^{k'}, p_k' p^k); \tag{3.17}$$

so, according to (3.14),

$$\Gamma_i(\mathbf{p}) = E_i F'(\kappa_i^2). \tag{3.18}$$

When the function $F(p_k p^k)$ has been determined, by substituting into (3.18), one obtains
immediately the total amount of matter, momentum and energy present in the field from
(3.14) and (3.15). This latter equation justifies the interpretation of $n_i(\mathbf{p})\mathbf{p}$ and $n_i(\mathbf{p})E_i$ as
the momentum and energy respectively associated with the amount of matter $n_i(\mathbf{p})$ in the
field, but the fact that $n_i(\mathbf{p})$ is an integer cannot be inferred until second quantisation has been
effected in the next section.

The example of the spinor field may be treated similarly. F is then regarded as a function
of $\tfrac12\alpha^k(p_k + p_k')$ alone. Defining G(z, z′) again by (3.16), one obtains instead of (3.17),

$$\pi_k' G^{k'} = \tfrac12\alpha^k(p_k' - p_k)\,G\{\alpha^k p_k', \tfrac12\alpha^k(p_k + p_k')\}, \qquad G^{k'} = \tfrac12\alpha^k G\{\alpha^k p_k', \tfrac12\alpha^k(p_k + p_k')\}. \tag{3.19}$$

The $\rho_{ij}$ in (3.12) have now to be regarded as Dirac operators of the form

(3.20)

where the coefficients are numerical factors; and the equations (3.13) show that the $\kappa_i = (E_i^2 - \mathbf{p}^2)^{1/2}$ are
the roots of the equation $F(\kappa) = 0$. From (3.14) one infers that

$$\Gamma_i(\mathbf{p}) = \tfrac12\alpha^4 F'(\kappa_i), \tag{3.21}$$

which, coupled with (3.14) and (3.15), again supplies explicit expressions for the amount of
matter, energy and momentum when the form of F is known.


4. Second Quantisation 

To show that the wave field may be regarded as consisting of particles with definite
energies and momenta, it is necessary to proceed to the second quantisation of the field, as
became apparent in the previous section. The correct commutation rules have been given
by one of the authors (Green, 1949), but it is necessary to remember that $R^4$ as defined by (3.5),
and not ρ, is the density operator for a system of many particles like that now considered.
The commutation rule is then

£*&=i, (/'=4 (4.1) 

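where $\zeta_i$ is a unitary “creation” operator corresponding to the solution $E_i$ of the equation
(3.13), and the positive or negative sign is employed according as Fermi or Bose statistics
are appropriate. Defining the operator $\sigma_i$ in such a way that

$$\rho_{ij}(\mathbf{p}, \mathbf{p}') = \sigma_i(\mathbf{p})\sigma_j^+(\mathbf{p}'),$$

$$\sigma_i(x_k) = \Omega^{-1/2}\sum_{\mathbf{p}}\sigma_i(\mathbf{p})\exp\{i(\mathbf{p}\cdot\mathbf{x} - E_i t)\},$$

$$\rho(x_k, x_k') = \sum_{ij}\sigma_i(x_k)\sigma_j^+(x_k'), \tag{4.2}$$

it is assumed that $\zeta_i$ commutes with $\sigma_j$ (or anticommutes for Fermi statistics) if i ≠ j, but not
with $\sigma_i$. Transforming the commutation rule (4.1) to the momentum representation by
expanding both sides in Fourier series, and remembering that δ(x − x′) is the unit “matrix”
implicit on the right-hand side, one has

$$\Gamma_i(\mathbf{p})\{\zeta_i\rho_{ii}(\mathbf{p}, \mathbf{p}) \pm \rho_{ii}(\mathbf{p}, \mathbf{p})\zeta_i\} = \zeta_i,$$

$$\zeta_i\rho_{jk}(\mathbf{p}, \mathbf{p}') \pm \rho_{jk}(\mathbf{p}, \mathbf{p}')\zeta_i = 0 \qquad (j \ne k \ne i \ \text{or} \ \mathbf{p} \ne \mathbf{p}'). \tag{4.3}$$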

By post-multiplication with this commutation rule reduces to the more usual form 

$$\Gamma_i(\mathbf{p})\{\sigma_i^+(\mathbf{p})\sigma_j(\mathbf{p}') \pm \sigma_j(\mathbf{p}')\sigma_i^+(\mathbf{p})\} = \delta_{ij}\delta_{\mathbf{p}\mathbf{p}'}, \tag{4.4}$$

which may equally well be adopted as the fundamental commutation rule. It may be remarked
that $\sigma_i^+$ is not necessarily the complex conjugate $\sigma_i^*$ of $\sigma_i$, but may also be taken to be $-\sigma_i^*$;
it is convenient to have this sign at one’s disposal, as one can then make $n_i(\mathbf{p}) = \Gamma_i(\mathbf{p})\sigma_i(\mathbf{p})\sigma_i^+(\mathbf{p})$
positive definite independently of the sign of $\Gamma_i(\mathbf{p})$. With the help of the commutation rule it
is then easily shown that $n_i$ has integral eigenvalues.

For Bose statistics one sees in succession that

$$n_i = \Gamma_i\sigma_i\sigma_i^+,$$

$$\Gamma_i\sigma_i n_i\sigma_i^+ = \Gamma_i\sigma_i\sigma_i^+(n_i - 1) = n_i(n_i - 1),$$

$$\Gamma_i\sigma_i n_i(n_i - 1)\sigma_i^+ = \Gamma_i\sigma_i\sigma_i^+(n_i - 1)(n_i - 2) = n_i(n_i - 1)(n_i - 2), \ \text{etc.} \tag{4.5}$$

are all positive definite, and this can only be if $n_i$ is an integer in a diagonal representation.
From (3.14) it then follows that $n_i(\mathbf{p})$ may be interpreted as the number of particles with
momentum p and energy $E_i$, i.e. with rest-mass $\kappa_i = (E_i^2 - \mathbf{p}^2)^{1/2}$. It is then clear that the
reciprocally invariant Lagrangian operators derived by the method indicated in § 2 will describe
assemblies of particles which will later be shown to have various rest-masses in agreement
with the experimentally determined meson masses.
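The statement that the occupation number has integral eigenvalues under Bose statistics may be illustrated with the ordinary textbook ladder operators. The sketch below is an assumption-laden illustration and not the operators σ of the paper: it sets the normalising factor Γ equal to 1, uses a finite truncation of the oscillator matrices, and all function names are hypothetical.

```python
import math

def annihilation_matrix(dim):
    """Truncated textbook lowering operator: entry [n-1][n] = sqrt(n)."""
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

def transpose(m):
    """Transpose of a real matrix (the adjoint, since entries are real)."""
    return [list(row) for row in zip(*m)]

def matmul(x, y):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]

def number_operator(dim):
    """n = a^+ a; its diagonal entries are the occupation numbers."""
    a = annihilation_matrix(dim)
    return matmul(transpose(a), a)

def commutator_diagonal(dim):
    """Diagonal of a a^+ - a^+ a; equal to 1 except in the last, truncated row."""
    a = annihilation_matrix(dim)
    a_dagger = transpose(a)
    left, right = matmul(a, a_dagger), matmul(a_dagger, a)
    return [left[i][i] - right[i][i] for i in range(dim)]

if __name__ == "__main__":
    n = number_operator(5)
    print([round(n[i][i], 12) for i in range(5)])  # occupation numbers 0, 1, 2, 3, 4
    print(commutator_diagonal(5))
```

The commutator equals unity on every state except the highest one retained, where the truncation of the infinite matrices shows itself.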

For Fermi statistics one has

$$n_i(1 - n_i) = \Gamma_i\sigma_i\sigma_i^+(1 - n_i) = \Gamma_i\sigma_i n_i\sigma_i^+, \tag{4.6}$$

from which it follows that $0 \le n_i \le 1$; and as $n_i = 0$ is certainly a permitted value, the only
other eigenvalue is 1. The obvious interpretation is that one and only one particle may
occupy each momentum state with assigned energy and spin.

It remains to complete the second quantisation of the field. By multiplying (4.4) by the
factor $\exp\{i(\mathbf{p}\cdot\mathbf{x} - E_i t - \mathbf{p}'\cdot\mathbf{x}' + E_j t)\}$ and summing over all values of p and p′, one obtains

^+G^)[a i ^x k ) ) <*(**)]± -8(x-a08« (*'«<)■ (4-7) 


Hence with the help of (3.6) and (3.14), 

W , « W ))~=2 J < 

= 2 J 0 ^**)^* ~ x ")s«^ = Oi(x k "). 


To show similarly that 


dcr/xf) 

[P k , 


(4.8) 


( 4 - 9 ) 


it is convenient to use the unsymmetrical expression

$$P_k = \int \langle p_k R^4 \rangle\,d\Omega \tag{4.10}$$

for $P_k$ instead of (3.10). The two expressions are equivalent, since they differ by
$\tfrac12\int \langle \pi_k R^4 \rangle\,d\Omega$, which vanishes by transformation to a surface integral when k = 1, 2, 3,
and by the relation $dN/dt = 0$ when k = 4. Using (4.10), one has immediately

CA, c^")] =<2 J < \(G* + 

which reduces to (4.9). It is easily verified now that the different components of $P_k$ commute
with one another. The total momentum and energy $P_k$ have therefore all the properties of
the differential operators $i\,\partial/\partial x^k$, and starting from the field equations (3.3), the entire theory
of § 3 may be transcribed into a fully quantised form.


5. Particles with Spin Zero or One 


As was shown in § 2, the most general scalar, vector, or tensor reciprocal invariant is an
eigensolution of the equation

$$(x_k x^k + p_k p^k)F = sF. \tag{5.1}$$

The relativistically covariant solutions of this equation are obtained most readily by transformation
to four-dimensional polar co-ordinates. It has been shown by Born and Fuchs
(1940) that the eigenfunctions of the square of the angular momentum tensor, which satisfy
the equation

$$M^2 Y_k = \tfrac12 m_{kl}m^{kl}\,Y_k = k(k+2)Y_k, \tag{5.2}$$

are four-dimensional spherical functions, and that k is a non-negative integer (k = 0, 1, 2, ...). The
operator $x_k x^k$ is transformed to

$$-\Big(\frac{\partial^2}{\partial p^2} + \frac{3}{p}\frac{\partial}{\partial p}\Big) + \frac{M^2}{p^2}, \qquad p^2 = p_k p^k.$$

Putting $p^2 = P$, so that

$$\frac{\partial^2}{\partial p^2} + \frac{3}{p}\frac{\partial}{\partial p} = 4P\frac{\partial^2}{\partial P^2} + 8\frac{\partial}{\partial P}, \tag{5.3}$$

and factorising F into a radial part $F_k(P)$ and a spherical function $Y_k$, so

$$F = F_k(P)\,Y_k(\theta, \phi, \omega),$$

one has

$$\frac{d^2F_k}{dP^2} + \frac{2}{P}\frac{dF_k}{dP} + \frac14\Big(\frac{s}{P} - 1 - \frac{k(k+2)}{P^2}\Big)F_k = 0. \tag{5.4}$$

This is solved by making the substitution

$$F_k = P^{k/2}e^{-P/2}f_k(P), \tag{5.5}$$

which leads to

$$\frac{d^2f_k}{dP^2} + \Big(\frac{k+2}{P} - 1\Big)\frac{df_k}{dP} + \frac{s - 2k - 4}{4P}f_k = 0, \tag{5.6}$$

of which the known solution is

$$f_k(P) = L_n^{k+1}(P), \tag{5.7}$$

where $L_n(P)$ is the Laguerre polynomial of order n. The eigenvalue s is given by the relation

$$s = 2(2n - k). \tag{5.8}$$

Clearly only values of n and k for which k ≥ 0 and n ≥ k + 1 correspond to admissible solutions.
Clearly only values of n and k for which k> 0 and n> k -hi correspond to admissible solutions. 
It has been seen that the rest-masses of the particles represented by the wave equation
are given as the roots of the equation $F(\kappa^2) = 0$, so that one has now

$$\kappa^k L_n^{k+1}(\kappa^2) = 0. \tag{5.9}$$

The distribution of the roots of these equations (5.9) is investigated by Dr A. E. Rodriguez in
Appendix II. The “ground state” in the spectrum of masses so determined corresponds to
the values k = 0, n = 2, and is characterised by κ² = 2, κ = √2. Substituting the accepted
value a = e²/mc² for the electronic radius into the equation (2.4), one obtains the rest-mass

$$\mu = \frac{\hbar c}{e^2}\,\kappa m = 137\sqrt{2}\,m = 194m, \tag{5.10}$$

where m is the electronic mass, regarded as electromagnetic in origin. This is in good agreement
with the observed values of the rest-mass of the stable μ-meson. The first excited state
corresponds to the values k = 0, n = 3, and requires that

$$\kappa^4 - 6\kappa^2 + 6 = 0, \qquad \kappa^2 = 3 + \sqrt{3};$$

so

$$\mu = \frac{\hbar c}{e^2}(3 + \sqrt{3})^{1/2}m = 298m.$$

This agrees well with the observed mass of the less stable π-meson. There exists also a second
root κ² = 3 − √3 which is associated with a particle of mass 154m which has not yet been
observed.
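The arithmetic behind the quoted masses can be reproduced directly. The sketch below rests on two assumptions which are not spelled out in the text as it survives: that $L_n^{k+1}$ denotes the (k+1)-th derivative of the Laguerre polynomial $L_n$, and that the normalisation of $L_n$ is the usual one with constant term 1 (the roots do not depend on it). The function names are hypothetical, and only the linear and quadratic cases needed here are handled.

```python
from fractions import Fraction
from math import comb, factorial, sqrt

INVERSE_FINE_STRUCTURE = 137  # the value of hbar*c/e^2 used in (5.10)

def laguerre_coefficients(n):
    """Coefficients of L_n(z) = sum_j (-1)^j C(n, j) z^j / j!, ascending powers."""
    return [Fraction((-1) ** j * comb(n, j), factorial(j)) for j in range(n + 1)]

def differentiate(coeffs, times):
    """Differentiate a polynomial (ascending coefficients) the given number of times."""
    for _ in range(times):
        coeffs = [j * c for j, c in enumerate(coeffs)][1:]
    return coeffs

def kappa_squared_roots(n, k):
    """Real roots z = kappa^2 of the (k+1)-th derivative of L_n, degree <= 2 only."""
    c = differentiate(laguerre_coefficients(n), k + 1)
    if len(c) <= 1:
        return []  # constant: no root from the polynomial factor
    if len(c) == 2:
        return [float(-c[0] / c[1])]
    if len(c) == 3:
        a0, a1, a2 = (float(v) for v in c)
        disc = a1 * a1 - 4.0 * a2 * a0
        if disc < 0:
            return []
        r = sqrt(disc)
        return sorted([(-a1 - r) / (2.0 * a2), (-a1 + r) / (2.0 * a2)])
    raise ValueError("only linear and quadratic cases are handled in this sketch")

def mass_in_electron_masses(kappa_squared):
    """mu / m = (hbar c / e^2) * kappa, following (5.10)."""
    return INVERSE_FINE_STRUCTURE * sqrt(kappa_squared)

if __name__ == "__main__":
    for n, k in ((2, 0), (3, 0), (3, 2)):
        roots = kappa_squared_roots(n, k)
        print(n, k, roots, [round(mass_in_electron_masses(z)) for z in roots])
```

For k = 0, n = 2 the single root κ² = 2 gives about 194 electron masses; for k = 0, n = 3 the two roots 3 ± √3 give about 298 and 154. For k = 2, n = 3 the polynomial factor is a constant, so that only the factor $\kappa^k$ can vanish.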

For /& = 2, 72 = 3, a *id generally for n-h + i, h > 2, the equation (5.9) reduces to k 2 ~o 7 
which is the characteristic equation for the photon. There exist, however, many other excited 
states in the mass-spectrum with masses corresponding to particles which have presumably 
not yet been observed, though it is possible that the supposed p-meson can be accounted for 
in this way, and also, since some of the masses are not far from 194772 and 298772, that they have 
been confused experimentally with the \i- and 7r-mesons. As these other particles are in 
excited states of rest-energy, it is, however, likely that they will have a short lifetime, decaying 
into the [jl- and w-mesons. This question can be discussed properly only by considering the 
interaction between the elementary particles. There are good grounds for supposing that 
this also can be treated quite generally by the methods of this paper, but for the present it is 
proposed to confine attention to certain aspects of the theory already developed. It has been 
seen in § 3 that the total energy of the field derived from a single .^-operator is the sum of the 




energies of the constituent particles. Therefore there can be no direct interaction between 
particles with masses derived from the same equation. Particles with masses derived from 
different equations may, however, be expected to interact with one another, leading to the 
decay processes which are experimentally observed. 


6. Identification of the Absolute Length 


There will now be given a preliminary treatment of the problem of the interaction between 
the electron and photon fields, which is the subject-matter of quantum electrodynamics. The 
difficulties associated with this subject are so well known that it is unnecessary to discuss them 
here; nor is it the object of this section to suggest a final solution. It will appear that there 
are indeed certain features of the present theory which suggest that the divergences appearing 
in current quantum electrodynamics may be eliminated, but for the present the authors are 
content to set aside this aspect. The primary question to be considered is the identification 
of the fundamental length $a$ with a quantity almost equal to the classical radius $e^2/mc^2$ of the
electron, on which the mass determinations of the previous sections were based. 

In order to make apparent the full significance of the calculation, ordinary units will be 
employed. Maxwell’s equations for the electromagnetic field may be written in the form 

FA = ftp, p= 47 reW, (6.1) 

where F is the self-reciprocal wave operator Pe~ p ^ h ' appropriate to the photon field,, p is the 
statistical matrix for the electron, and 

$$A = \alpha^k A_k + A', \qquad (6.2)$$

where $A_k$ is the four-vector potential of the photon field, and $A'$ is some operator such that
the spur of $\alpha_k A'$ vanishes. By multiplying (6.1) by $\alpha_k$ and taking the quarter-spur of both
sides, one then obtains what is effectively the ordinary form of Maxwell’s equations, since $b$ is
large. Writing

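$$F\sigma = FA - \beta\rho, \qquad (6.3)$$

the field equations (6.1) can be derived in the usual way from the Lagrangian

$$L_1 = \tfrac12\,\sigma F\sigma = \tfrac12\,AFA - \tfrac12\,\beta(A\rho + \rho A) + \tfrac12\,\beta^2\rho F^{-1}\rho. \qquad (6.4)$$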

As this Lagrangian is somewhat different from that usually adopted, it is necessary to consider 
briefly its distinctive features. The statistical matrix p of the electron, regarded as a particle 
with spin half and zero rest-mass, has the form 


p^Qr 1 


a TcP 

> 


& 

-. e W k (z k -xV){%, 


( 6 . 5 ) 


so $A'$ may be assumed to vanish. Then the first term on the right-hand side of (6.4)
reduces to 


1(A V )( V A) = * (jjL - ■iF*'j 

( 5 ^)' 

WV 8A k V 

1 

= 7iw; 

7 i— a ki ,k > 

y = 


( 6 . 6 ) 


On account of the auxiliary condition $\partial A_k/\partial x_k = 0$, which is necessary to the conservation of charge,
(6.6) differs from the usual Lagrangian of a pure radiation field only by the term $\epsilon^{klmn}F_{kl}F_{mn}$
which, being the scalar product of the electric and magnetic field vectors, vanishes on integra¬ 
tion over all space. The second term in (6.4) is obviously the usual one representing the 
interaction between the pure radiation field and the electron. The last term is new, but can be 




seen without much difficulty to be substantially equivalent to the energy of the longitudinal 
part of the electromagnetic field, reducing to the Coulomb energy for a charge distribution at 
rest, but having the property of relativistic invariance. 

To the Lagrangian (6.4) for the electromagnetic field must be added the term 


Z 2 =Zp = 47rdi 2 rjp ( 6 . 7 ) 

for the electrons. The usual variational procedure applied to the total Lagrangian $L = L_1 + L_2$
leads to the field equations (6.1) for the Maxwell field, and 


(47Tcft 2 7) - pA)p + ^(pF-^p = o 


( 6 . 8 ) 


for the electron. This is a non-linear equation not easily solved exactly, but evidently it is 
practically identical with the Dirac equation provided the mass m of the electron is taken to 

WpF - 1 

be of the order of the expectation value of the operator —• , i.e. provided 


47 u 2 h 2 


.-fcY2b‘ £2( 277 ^)-3 
k 2 47 T 0 W 



e -1cV2 W^Trdk 



(6.9) 


This is the first approximation to the self-energy of the electron, and converges. It is likely 
that higher approximations will increase the value slightly, so that, as nearly as can be estimated,
the value $a = e^2/mc^2$ assumed in the last section was exact: though further calculations
may show the need for a small correction, the meson masses thereby determined were probably 
correct within the present limits of experimental error. 


7. APPENDIX I 


The Reciprocal Invariants for Spin Half. By K. C. Cheng 

The problem of the determination of the reciprocally invariant function $F(x_j, p_j)$ appropriate
to particles of spin half will here be considered. It is first necessary to formulate the metric
operator $S(x_j, p_j)$ which, to comply with the requirements of the principle of reciprocity, must
satisfy the equation

$$S(x_j, p_j) = T\,S(p_j, -x_j)\,T^{-1}, \qquad (7.1)$$

where $T$ is a unitary operator depending only on the Dirac matrices $\alpha_j$. To (7.1) must be
added the conditions (i) that $S$ should be relativistically invariant and linear in both $x_j$ and $p_j$,
(ii) that the solutions $F$ obtained from the metric operator $S$ should be also the solutions of the
equation

$$(R + P + C)F = Fs^2, \qquad (7.2)$$

where $C$, $s^2$ are constants; and (iii) that for integral spin the functions $F$ should be identical
with those obtained in § 5, i.e. in this case $F$ must commute with $C$.

$S$ is therefore some combination of the relativistic invariants:

$$\xi = \alpha_j x_j, \qquad \eta = \alpha_j p_j, \qquad \gamma = \alpha_1\alpha_2\alpha_3\alpha_4 = \frac{1}{4!}\,\epsilon^{jlmn}\alpha_j\alpha_l\alpha_m\alpha_n. \qquad (7.3)$$

It will be shown that it is sufficient to choose the simple form: 

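$$S = \eta + i\gamma\xi. \qquad (7.4)$$

The condition (7.1) is thereby satisfied provided $T = \dfrac{1 + i\gamma}{1 + i}$, so that $T^{-1} = \dfrac{1 - i\gamma}{1 - i}$. For, since $\gamma$
anti-commutes with $\alpha_j$, one has

$$T\,S(x_j, p_j)\,T^{-1} = \frac{1 + i\gamma}{1 + i}\,(\eta + i\gamma\xi)\,\frac{1 - i\gamma}{1 - i} = \frac{(1 + i\gamma)^2}{(1 + i)(1 - i)}\,(\eta + i\gamma\xi) = (i\gamma\eta - \xi) = S(p_j, -x_j).$$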
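The transformation property of the metric operator can be checked numerically. The following is a minimal sketch under stated assumptions: the four anticommuting matrices are one particular Euclidean choice built from Pauli matrices (the paper's own representation and metric are not given in this section), and the components of x and p are replaced by ordinary numbers, which is legitimate only because T acts on the matrix part alone. All names are illustrative.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def kron(a, b):
    """Kronecker product of two 2x2 matrices."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))


I2 = [[1, 0], [0, 1]]
S1 = [[0, 1], [1, 0]]
S2 = [[0, -1j], [1j, 0]]
S3 = [[1, 0], [0, -1]]
I4 = kron(I2, I2)

# Assumed representation: four mutually anticommuting matrices with unit square.
ALPHAS = [kron(S1, S1), kron(S1, S2), kron(S1, S3), kron(S2, I2)]
GAMMA = mat_mul(mat_mul(ALPHAS[0], ALPHAS[1]), mat_mul(ALPHAS[2], ALPHAS[3]))

T = scale(1 / (1 + 1j), mat_add(I4, scale(1j, GAMMA)))
T_INV = scale(1 / (1 - 1j), mat_add(I4, scale(-1j, GAMMA)))


def contract(v):
    """Sum over j of alpha_j * v_j."""
    total = scale(0, I4)
    for comp, alpha in zip(v, ALPHAS):
        total = mat_add(total, scale(comp, alpha))
    return total


def metric_operator(x, p):
    """S(x, p) = eta + i*gamma*xi with eta = alpha.p and xi = alpha.x."""
    return mat_add(contract(p), scale(1j, mat_mul(GAMMA, contract(x))))


x = [0.3, -1.2, 0.7, 2.0]   # arbitrary numerical stand-ins
p = [1.1, 0.4, -0.5, 0.9]
lhs = mat_mul(mat_mul(T, metric_operator(x, p)), T_INV)
rhs = metric_operator(p, [-v for v in x])
print(close(mat_mul(GAMMA, GAMMA), I4), close(lhs, rhs))
```

The check relies only on γ² = 1 and on γ anticommuting with each matrix, so any other representation with those two properties should behave the same way.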


The wave operator $F(x_j, p_j)$ from which the Lagrangian is to be derived will now be
determined as the solution of the equation

$$F^{-1}SF = s,$$

or

$$SF = Fs, \qquad (7.5)$$

where $s$ is an eigensymbol depending only on $\gamma$. For convenience in calculation define the
following quantities:

$$L = -i\alpha_j\alpha_l m^{jl}, \qquad m^{jl} = x^j p^l - x^l p^j. \qquad (7.6)$$

By direct substitution it is easily verified that

$$L^2 + 4L = 2m_{jl}m^{jl}. \qquad (7.7)$$

Assume that $F(p_j)$ is expressible in the form

$$F = (S + s)\,F_k(\eta^2)\,g(\theta, \phi, \omega), \qquad (7.8)$$

where $F_k(\eta^2)$ is a function of $\eta^2 = p^2 = p_j p^j$ only, and $g$ is an eigensymbol depending on the
polar angles $\theta$, $\phi$, $\omega$ of the four-vector $p_j$, and satisfies the equation

$$Lg = 2kg, \qquad (7.9)$$

so that $2k$ is the corresponding eigenvalue of $L$. These eigenvalues and eigensymbols can be
determined in the following way. Writing

$$g = (L + 2k + 4)Y_k, \qquad (7.10)$$

and using the relation (7.7), one obtains

$$(L - 2k)(L + 2k + 4)Y_k = 2(m_{jl}m^{jl} - 2k^2 - 4k)Y_k. \qquad (7.11)$$

Hence (7.9) is satisfied if $Y_k$ is an eigenfunction and $k(k + 2)$ is the corresponding eigenvalue
of $\tfrac12 m_{jl}m^{jl}$.

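It is well known that the solutions of the equation

$$\frac{\partial^2 U}{\partial p_j\,\partial p^j} = 0 \qquad (7.12)$$

are

$$U_k = \frac{\partial^k}{\partial p_1^{k_1}\,\partial p_2^{k_2}\,\partial p_3^{k_3}\,\partial p_4^{k_4}}\,\frac{1}{p^2}, \qquad (7.13)$$

where $k = k_1 + k_2 + k_3 + k_4 = 0, 1, 2, \ldots$ and $p^2 = p_j p^j$. Putting $Y_k = p^{k+2}U_k$; then $Y_k$ is
independent of $p$, and

$$0 = \frac{\partial^2 U_k}{\partial p_j\,\partial p^j} = \left(\frac{\partial^2}{\partial p^2} + \frac{3}{p}\frac{\partial}{\partial p} - \frac{m_{jl}m^{jl}}{2p^2}\right)p^{-(k+2)}\,Y_k,$$

or

$$Y_k\left(\frac{\partial^2}{\partial p^2} + \frac{3}{p}\frac{\partial}{\partial p}\right)p^{-(k+2)} - \tfrac12 m_{jl}m^{jl}\,p^{-(k+4)}\,Y_k = 0.$$

Hence

$$\tfrac12 m_{jl}m^{jl}\,Y_k = k(k + 2)Y_k. \qquad (7.14)$$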

The four-dimensional spherical function $Y_k$ is therefore an eigenfunction of $\tfrac12 m_{jl}m^{jl}$. The
proper values of $k$ are $0, 1, 2, \ldots$, as was found otherwise by Born and Fuchs (1940).

Now the components of the angular momentum tensor $m_{jl} = (x_j p_l - x_l p_j)$ are well known
to be expressible in terms of angular operators only. The same holds for $L$, according to
(7.6). Therefore

$$g = (L + 4 + 2k)Y_k = Z_k \qquad (7.15)$$

is a function of the angles $\theta$, $\phi$, $\omega$ alone, and may therefore be regarded as a generalisation of
the spherical harmonics for spin half. Notice that $g$ commutes with $\gamma$, therefore with $s$.
Hence equation (7.5) becomes

$$S^2F_kg = F_kgs^2. \qquad (7.16)$$

Since $\gamma$ anticommutes with the $\alpha_j$'s, one has from (7.4)

$$S^2 = \eta^2 + \xi^2 + i\gamma(\xi\eta - \eta\xi) = \eta^2 + \xi^2 - \gamma(L + 4). \qquad (7.17)$$

Using (7.9), one obtains

$$\gamma(L + 4)F_kg = F_kg(2k + 4)\gamma. \qquad (7.18)$$


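Substituting (7.17) and (7.18) into (7.16), one has

$$\left\{\frac{\partial^2}{\partial\eta^2} + \frac{3}{\eta}\frac{\partial}{\partial\eta} - \frac{m_{jl}m^{jl}}{2\eta^2} - \eta^2 + s^2 + (2k + 4)\gamma\right\}F_kg = 0. \qquad (7.19)$$

On account of the relation (7.9) and (7.7), one has

$$2m_{jl}m^{jl}g = (L^2 + 4L)g = 4k(k + 2)g,$$

and thus the equation (7.19) becomes

$$\left\{\frac{\partial^2}{\partial\eta^2} + \frac{3}{\eta}\frac{\partial}{\partial\eta} - \frac{k(k + 2)}{\eta^2} - \eta^2 + s^2 + (4 + 2k)\gamma\right\}F_k = 0. \qquad (7.20)$$

Putting $P = p_j p^j$, one obtains

$$\left\{P\frac{d^2}{dP^2} + 2\frac{d}{dP} - \frac{k(k + 2)}{4P} - \frac{P}{4} + \frac{s^2 + (4 + 2k)\gamma}{4}\right\}F_k = 0. \qquad (7.21)$$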


Now (7.21) is identical with (5.4), so that the eigenfunction $F_k(P)$ and the eigenvalues
$(s^2 + 4\gamma + 2\gamma k)$ are the same as those given by (5.5), (5.7) and (5.8), namely

$$F_k = P^{k/2}e^{-P/2}L_n^{k+1}(P), \qquad s^2 + 4\gamma + 2\gamma k = 4n - 2k, \qquad (7.22)$$

or

$$s_\pm = \pm\sqrt{4n - 2(1 - \gamma)k - 4\gamma(k + 1)}. \qquad (7.23)$$

The complete wave function for particles having spin half is

$$F(\eta) = \{\eta + i\gamma\xi + s_\pm\}\,P^{k/2}e^{-P/2}L_n^{k+1}(P)\,g, \qquad (7.24)$$

$$s_\pm = \pm\sqrt{4n - 2(1 - \gamma)k - 4\gamma(k + 1)}. \qquad (7.25)$$

Lastly, we shall show that the requirements in (7.2) are fulfilled. By multiplying (7.5)
from the left by $S$, one has

$$S^2F = Fs^2, \qquad (7.26)$$

or, in view of (7.17) and (7.18),

$$\{R + P - (4 + 2k)\gamma\}F = Fs^2. \qquad (7.27)$$

This shows that all the three conditions are satisfied: (i) The relativistic invariance is obvious;
(ii) the equation (7.27) has the form of (7.2) with $C = -(4 + 2k)\gamma$; (iii) for particles of integral
spin, $\gamma$ commutes with the function $F$ determined in § 5.

In the following we give an explicit formula of the wave operator $F(\eta)$ suitable for determining
the masses, which are the roots of the equation $F(\eta) = 0$. We note that if we multiply
$F(\eta)$ in (7.22) from the right by $(1 \pm \gamma)$, this function still satisfies (7.5), since $\gamma$ commutes
with $\pm s$. This process is equivalent to putting $\gamma$, wherever it appears on the right side of $\eta$
in $F(\eta)$, equal to $\pm 1$. With this procedure, we obtain the following functions:

$$F(\eta) = \left\{\eta\left[(1 - \gamma)L_n^{k+1}(\eta^2) + 2\gamma L_n^{k+2}(\eta^2)\right] \pm 2\sqrt{n - \tfrac12(1 - \gamma)k - \gamma(k + 1)}\;L_n^{k+1}(\eta^2)\right\}\eta^k e^{-\eta^2/2}\,Z_k\,(1 \pm \gamma). \qquad (7.28)$$

The equation $F(\eta) = 0$ gives the masses in the following cases:

(i) $\gamma = 1$,

$$\eta L_n^{k+2}(\eta^2) \pm \sqrt{n - k - 1}\;L_n^{k+1}(\eta^2) = 0; \qquad (7.29)$$

(ii) $\gamma = -1$,

$$\eta\left\{L_n^{k+1}(\eta^2) - L_n^{k+2}(\eta^2)\right\} \pm \sqrt{n + 1}\;L_n^{k+1}(\eta^2) = 0; \qquad (7.30)$$

and $\eta = 0$ for all $k > 0$.


8. APPENDIX II 


The Calculation of the Rest-Masses. By A. E. Rodriguez 

(i) It has been seen in § 5 that there are an infinite number of particles of integral spin
with rest-masses

$$\mu = \frac{\hbar c}{e^2}\,\kappa m, \qquad (8.1)$$

$\kappa^2$ being given by the roots of the equation (5.9).

The Laguerre polynomials are easily calculated with the help of the recurrence formula

$$L_{n+1}(x) - (2n + 1 - x)L_n(x) + n^2L_{n-1}(x) = 0, \qquad L_0(x) = 1, \quad L_1(x) = 1 - x. \qquad (8.2)$$
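The recurrence (8.2) is simple to run mechanically. The sketch below builds the coefficient lists of the first few polynomials with exact integer arithmetic; the function name and the choice of ascending-power coefficient lists are illustrative assumptions, not notation from the paper.

```python
def laguerre_polynomials(n_max):
    """Coefficient lists (ascending powers of x) of L_0 .. L_n_max from (8.2).

    Uses L_{n+1}(x) = (2n + 1 - x) L_n(x) - n^2 L_{n-1}(x),
    starting from L_0 = 1 and L_1 = 1 - x.
    """
    polys = [[1], [1, -1]]
    for n in range(1, n_max):
        prev, cur = polys[n - 1], polys[n]
        nxt = [0] * (n + 2)
        for i, c in enumerate(cur):
            nxt[i] += (2 * n + 1) * c   # (2n + 1) * L_n
            nxt[i + 1] -= c             # -x * L_n
        for i, c in enumerate(prev):
            nxt[i] -= n * n * c         # -n^2 * L_{n-1}
        polys.append(nxt)
    return polys[:n_max + 1]


for n, coeffs in enumerate(laguerre_polynomials(4)):
    print(n, coeffs)
```

With this normalisation the leading coefficient of the n-th polynomial is (−1)ⁿ and the constant term is n!, which is the older, unnormalised convention that the recurrence itself implies.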


The equation (5.9) reduces, after putting $\kappa^2 = x$, to

For $k = 0$,

$$\begin{aligned}
n = 2:&\quad x - 2 = 0,\\
n = 3:&\quad x^2 - 6x + 6 = 0,\\
n = 4:&\quad x^3 - 12x^2 + 36x - 24 = 0,\\
n = 5:&\quad x^4 - 20x^3 + 120x^2 - 240x + 120 = 0,\\
n = 6:&\quad x^5 - 30x^4 + 300x^3 - 1200x^2 + 1800x - 720 = 0.
\end{aligned} \qquad (8.3)$$

For $k = 1$,

$$\begin{aligned}
n = 4:&\quad x - 4 = 0,\\
n = 5:&\quad x^2 - 10x + 20 = 0,\\
n = 6:&\quad x^3 - 18x^2 + 90x - 120 = 0.
\end{aligned} \qquad (8.4)$$

For $k = 2$,

$$n = 6:\quad x - 6 = 0. \qquad (8.5)$$
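The roots of the polynomial equations listed above can be obtained numerically and converted to masses. This is a sketch under two assumptions: the conversion uses the rounded factor 137 for ħc/e² together with μ/m = 137√x, and the root finder (bisection between the critical points of the polynomial) is an illustrative choice, not the method used in the paper.

```python
import math


def poly_eval(coeffs, x):
    """Horner evaluation; coefficients are in descending powers."""
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def derivative(coeffs):
    degree = len(coeffs) - 1
    return [c * (degree - i) for i, c in enumerate(coeffs[:-1])]


def bisect(coeffs, lo, hi):
    """Bisection on an interval whose end points give opposite signs."""
    f_lo = poly_eval(coeffs, lo)
    for _ in range(200):
        mid = (lo + hi) / 2
        f_mid = poly_eval(coeffs, mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2


def real_roots(coeffs):
    """Simple real roots, found between consecutive critical points."""
    if len(coeffs) == 2:
        return [-coeffs[1] / coeffs[0]]
    bound = 1 + max(abs(c / coeffs[0]) for c in coeffs[1:])  # Cauchy bound
    knots = [-bound] + real_roots(derivative(coeffs)) + [bound]
    return [bisect(coeffs, lo, hi) for lo, hi in zip(knots, knots[1:])
            if poly_eval(coeffs, lo) * poly_eval(coeffs, hi) < 0]


def masses(coeffs):
    """Masses in electron masses, assuming mu/m = 137 * sqrt(x)."""
    return [137.0 * math.sqrt(x) for x in real_roots(coeffs)]


EQUATIONS_K0 = {2: [1, -2], 3: [1, -6, 6], 4: [1, -12, 36, -24]}
for n, coeffs in EQUATIONS_K0.items():
    print(n, [round(m, 1) for m in masses(coeffs)])
```

The higher-order equations in (8.3) to (8.5) can be added to the dictionary in the same way.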




The solutions of these equations are given in the following table. 



The associated rest-masses, calculated assuming the validity of (8.1), are represented in
fig. 1. The asymptotic distribution of these roots for large values of n has been studied by 
Tricomi (1940), who has given the formula 

$$x = 4n'\sin^2\theta_{n',j} \qquad (8.6)$$

for the $j$th root of the Laguerre polynomial $L_{n'}(x)$, where $\theta_{n',j}$ satisfies the transcendental
equation


j 4j / j 2^ \ 

2$„'J + Sm 26n',i + -^r6n’,S = [&-—^- K 


$n'$ being related to $n$ by $n' = n - 1$.

For a given value $n'$ it follows from (8.6) and (8.7),

dx = 8 n sm @n r j cos 0 ^jdOn'fo 

8„’i+—r)d8 n 'i- 


?*-{ 

For/=i, (8.8) can be reduced to 


2 + 2 cos 26 


* I 


dx 


1 & 

2+~ 7 

ri 2 n 

fx tf 2 "* 

2 V ri 4 ri 2 


(8.7) 


( 8 . 8 ) 


x< 4n . 


(8.9) 



This formula gives the asymptotic density of the masses for large values of $n$, corresponding
to the highly excited states of the mass spectrum. 

(ii) For half integral spin there are two cases to consider, corresponding to the values
$\pm 1$ of $\gamma$ in the equation (7.22). For $\gamma = 1$, (7.8) with (7.22) becomes

$$F(\eta) = \left\{2\eta L_n^{k+2}(\eta^2) \pm 2\sqrt{n - (k + 1)}\;L_n^{k+1}(\eta^2)\right\}\eta^k e^{-\eta^2/2}g, \qquad (8.10)$$

which vanishes identically for $k = n - 1$; for $k = n - 2$ one obtains

$$F(\eta) = 2\left\{\eta L_n^n(\eta^2) \pm L_n^{n-1}(\eta^2)\right\}\eta^{n-2}e^{-\eta^2/2}g, \qquad (8.11)$$

which becomes

$$F(\eta) = 2L_n^n(\eta^2)\left\{\eta \pm (\eta^2 - n)\right\}\eta^{n-2}e^{-\eta^2/2}g \qquad (8.12)$$

on using the relation

$$L_n^{n-1}(\eta^2) = L_n^n(\eta^2)\{\eta^2 - n\}. \qquad (8.13)$$

For $n = 2$ there are only two masses, namely $\eta = 1$ and $\eta = 2$.
This can be interpreted as indicating a state for which only spin-½ mesons of masses 1 and 2
are present.

For $n = 3, 4, 5, \ldots$ one obtains

$$\eta = 0, \qquad \eta = \frac{\pm 1 + \sqrt{1 + 4n}}{2}.$$

The root which gives $\eta = 0$ may be regarded as representing states for which there are neutrinos
present or electrons for charged particles by coupling with an electromagnetic field. The
other solutions represent states for which there are spin-½ mesons of masses

$$\frac{\pm 1 + \sqrt{1 + 4n}}{2}.$$
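The relation (8.13) between the two highest derivatives can be verified with exact integer arithmetic. The sketch assumes that the upper index denotes repeated differentiation of the n-th Laguerre polynomial generated by the recurrence (8.2), which is the older convention and is not stated explicitly in this section; the function names are illustrative.

```python
def laguerre(n):
    """L_n from the recurrence (8.2); coefficients in ascending powers of x."""
    prev, cur = [1], [1, -1]
    if n == 0:
        return prev
    for m in range(1, n):
        nxt = [0] * (m + 2)
        for i, c in enumerate(cur):
            nxt[i] += (2 * m + 1) * c
            nxt[i + 1] -= c
        for i, c in enumerate(prev):
            nxt[i] -= m * m * c
        prev, cur = cur, nxt
    return cur


def derivative(coeffs, order):
    """Differentiate a coefficient list `order` times."""
    for _ in range(order):
        coeffs = [i * c for i, c in enumerate(coeffs)][1:]
    return coeffs


def check_relation(n):
    """True if the (n-1)-th derivative equals the n-th derivative times (x - n)."""
    top = derivative(laguerre(n), n)[0]       # a constant
    lower = derivative(laguerre(n), n - 1)    # linear in x
    return lower == [-n * top, top]


print(all(check_relation(n) for n in range(1, 9)))
```

Under that assumption the relation holds for every n tried, which is what allows the common constant factor to be taken outside the bracket in (8.12).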



For obtaining rest-masses for k — n- i, others than those already given, the same process 
may be applied by repeated use of the formula 

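For $\gamma = -1$, (7.8) with (7.22) becomes

$$F(\eta) = 2\left\{\eta L_n^{k+1}(\eta^2) - \eta L_n^{k+2}(\eta^2) \pm \sqrt{n + 1}\;L_n^{k+1}(\eta^2)\right\}\eta^k e^{-\eta^2/2}g, \qquad (8.15)$$

which for $k = n - 1$ reduces to

$$F(\eta) = 2L_n^n(\eta^2)\left\{\eta \pm \sqrt{n + 1}\right\}\eta^{n-1}e^{-\eta^2/2}g. \qquad (8.16)$$

For $n = 1$ there is only one mass $\eta = \sqrt{2}$. This indicates a state for which there are spin-½
mesons of mass $\sqrt{2}$ present.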


0 0 


■1)! 


(*-2)! 






(8.14) 




For $n = 2, 3, 4, \ldots$ one obtains

$$\eta = 0, \qquad \eta = \sqrt{n + 1}.$$

Again this result may be regarded as representing states for which there exist neutrinos (or
electrons for charged particles by coupling with an electromagnetic field) and spin-½ mesons of
masses $\sqrt{n + 1}$.

For $k = n - 2$, (8.15) becomes, by using (8.13),

$$F(\eta) = 2L_n^n(\eta^2)\left\{\eta^3 \pm \sqrt{n + 1}\,\eta^2 - (n + 1)\eta \mp n\sqrt{n + 1}\right\}\eta^{n-2}e^{-\eta^2/2}g. \qquad (8.17)$$

A similar consideration to the one given above shows that for $n = 2$ there are only spin-½
mesons present, and again neutrinos (or electrons) and spin-½ mesons for $n = 3, 4, 5, \ldots$

The associated rest-masses for the simple cases discussed in this section, calculated assuming 
the validity of (8.1), are represented in fig. 2. 
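The closed-form spin-half masses discussed in this section can be tabulated directly. The sketch below covers only the two simplest cases, γ = +1 with k = n − 2 (roots of η² ± η − n = 0) and γ = −1 with k = n − 1 (root η = √(n + 1)), and returns η in the dimensionless units of the text; the function names are illustrative.

```python
import math


def masses_gamma_plus(n):
    """gamma = +1, k = n - 2: positive roots of eta^2 +/- eta - n = 0.

    For n >= 3 the factor eta^(n-2) contributes the additional root eta = 0.
    """
    root = math.sqrt(1 + 4 * n)
    values = [(-1 + root) / 2, (1 + root) / 2]
    return ([0.0] if n >= 3 else []) + values


def masses_gamma_minus(n):
    """gamma = -1, k = n - 1: eta = sqrt(n + 1), plus eta = 0 for n >= 2."""
    return ([0.0] if n >= 2 else []) + [math.sqrt(n + 1)]


for n in range(1, 6):
    plus = masses_gamma_plus(n) if n >= 2 else []
    print(n, [round(v, 3) for v in plus], [round(v, 3) for v in masses_gamma_minus(n)])
```

For n = 2 the first function returns the masses 1 and 2, and for n = 1 the second returns √2, in agreement with the special cases singled out above.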

[Remark added in proof.] Meanwhile it has been found that there is another procedure 
for forming the Lagrangian operator in a reciprocally invariant manner, which is in some 
ways preferable. It consists in assuming, instead of (2.22), 

F(x h} ocj!)=F(pi)F(pi)p{x k , x k ') } 
which leads to the same values for the rest-masses. 


REFERENCES TO LITERATURE 

Born, M., 1934. Proc. Roy. Soc., A, CXLIII, 410.
-, 1938 a. Proc. Ind. Acad. Sci., VIII, 309.
-, 1938 b. Proc. Roy. Soc., A, CLXV, 291.
-, 1939. Proc. Roy. Soc. Edin., A, LIX, 219.
Born, M., and Fuchs, K., 1940. Ibid., A, LX, 100, 141.
Chang, T. S., 1946. Proc. Camb. Phil. Soc., XLII, 1, 32.
-, 1948. Ibid., XLIV, 76.
Dirac, P. A. M., 1942. Proc. Roy. Soc., A, CLXXX, 1.
Fuchs, K., 1940. Proc. Roy. Soc. Edin., A, LX, 147.
-, 1941. Ibid., A, LXI, 26.
Fürth, R., 1929. Zeits. f. Phys., LVII, 429.
Green, H. S., 1949. Proc. Roy. Soc., A. [In press.]
Heisenberg, W., and Pauli, W., 1929. Zeits. f. Phys., LVI, 1.
Heisenberg, W., 1938. Ann. d. Phys., XXXII, 20.
-, 1943. Zeits. f. Phys., CXX, 513.
Landé, A., 1939. Phys. Rev., LVI, 482.
March, A., 1938. Naturwiss., XXVI, 649.
-, 1947. Acta Physica Austriaca, I, 19, 137.
Snyder, H. S., 1947. Phys. Rev., LXXI, 38.
Tricomi, F., 1940. Atti d. R. Acc. di Torino, LXXVI, 1.
de Wet, J. S., 1948. Proc. Camb. Phil. Soc., XLIV, 546.


(Issued separately May 24, 1949) 



INDEX 


Aitken (A. C.). Studies in Practical Mathematics, 
IV. On Linear Approximation by Least Squares, 
138-146. 

-On a Problem of Correlated Errors, 273-277. 

-On the Estimation of Many Statistical Parameters, 369-377.

Algebras, Commutative, by D. E. Rutherford, 454-459. 

Apeiron, Definition of, and Statistics, by M. Born and
H. W. Peng, 92-102. 

Atoms, Wave Functions for Ground States of Atoms 
Li to Ne, by W. E. Duncanson and C. A. Coulson, 
37-39.

Born (M.) and Green (H. S.). Quantum Theory of 
Rest-Masses. With Appendices by K. C. Cheng 
and A. E. Rodriguez, 470-488. 

Born (M.) and Peng (H. W.). Quantum Mechanics 
of Fields. I. Pure Fields, 40-57.

-Quantum Mechanics of Fields. II. Statistics 

of Pure Fields, 92-102. 

-Quantum Mechanics of Fields. III. Electromagnetic
Field and Electron Field in Interaction, 127-137.

Bradburn (Mary), Coulson (C. A.) and Rushbrooke 
(G. S.). Graphite Crystals and Crystallites. I. 
Binding Energies in Small Crystal Layers, 336-349. 

Bruges (W. E.). Evaluation and Application of 
Certain Ladder-Type Networks, 175-186. 

Canonical Transformations: Universal Integral In¬ 
variants of Hamiltonian Systems, by Hwa-Chung 
Lee, 237-246. 

Chebyshev Polynomials, Tables of, by C. W. Jones,
J. C. P. Miller, J. F. C. Conn and R. C. Pankhurst, 
187-203. 

-Two Numerical Applications of, by J. C. P.

Miller, 204-210. 

Cheng (K. C.). See Born (M.) and Green (H. S.).

Clark (G. L.). On the Gravitational Mass of a
System of Particles, 412-423. 

-The Equivalence of Gravitational and Invari¬ 
ant Mass of an Isolated Body at Rest, 424-426. 

-The Internal and External Fields of a Particle 

in a Gravitational Field, 427-433. 

-The Mechanics of Continuous Matter in the 

Relativity Theory, 434-441. 

Conn (J. F. C.). See Jones (C. W.), Miller (J. C. P.), 
Conn (J. F. C.) and Pankhurst (R. C.). 

Copson (E. T.). On Whittaker’s Solution of Laplace’s 
Equation, 31-36. 

Correlated Errors, Distribution of, by A. C. Aitken,
273-277. 

Coulson (C. A.) and Duncanson (W. E.). Atomic 
Wave Functions for Ground States of Elements Li 
to Ne, 37-39. 

Coulson (C. A.) and Rushbrooke (G. S.). Graphite 
Crystals and Crystallites. II. Energies of Mobile 
Electrons in Infinite Strips, 350-359.

- See Bradburn (Mary), Coulson (C. A.) and 

Rushbrooke (G. S.). 

- See Gillam (C. M.) and Coulson (C. A.). 

Crystals, Graphite. I. Binding Energies in Small 
Layers, by Mary Bradburn, C. A. Coulson and 
G. S. Rushbrooke, 336-349.

-Energies of Mobile Electrons in Infinite Strips 

of, by C. A. Coulson and G. S. Rushbrooke, 350-359.


Determinants, Continuant, by D. E. Rutherford, 
229-236. 

Difference-differential Equation with Constant Co¬ 
efficients, by E. M. Wright, 387-393. 

Diffusion, Thermal, in Some Aqueous Solutions, by 
A. C. Docherty and M. Ritchie, 297-304. 

-An Elementary Treatment of Gaseous and 

Liquid Systems, by M. Ritchie, 305-315. 

Dingle (H.). The Nature of Scientific Philosophy: 
Science regarded as a philosophy in which bare 
experiences are taken as the fundamental data and 
are correlated without regard to their normal 
association to form physical objects, 400-411. 

Discriminant of a Certain Ternary Quartic, by W. L. 
Edge, 268-272. 

Distribution of Particles, Random, by W. O. Kermack 
and P. Eggleton, 103-115. 

Docherty (A. C.) and Ritchie (M.). Thermal 
Diffusion in some Aqueous Solutions, 297-304. 

Duncanson (W. E.) and Coulson (C. A.). Atomic 
Wave Functions for Ground States of Elements Li 
to Ne, 37-39. 

“Dynamical Time”, in General Relativity, by G. C. 
McVittie, 147-155. 

Edge (W. L.). The Identification of Klein’s Quartic, 

83-91.

-The Discriminant of a Certain Ternary Quartic, 

268-272. 

Eggleton (P.) and Kermack (W. O.). A Problem in 
the Random Distribution of Particles, 103-115. 

Elements, Number determined by Spontaneous 
Fission and by ^-activity, by N. Feather, 211-220. 

Elliptic Functions, Applications to Wind Tunnel Inter¬ 
ference, by L. M. Milne-Thomson, 316-318. 

Equilibrium, Approximate Relativistic Equations of, 
by G. L. Clark, 434-441. 

Erdélyi (A.). Expansions of Lamé Functions into
Series of Legendre Functions, 247-267. 

-Transformations of Hypergeometric Functions 

of Two Variables, 378-385. 

Errors, Correlated, Problem in, by A. C. Aitken, 
273-277.

Estimation, Method of Statistical, by A. C. Aitken, 
369-377.

Etherington (I. M. H.). Non-Associative Arithmetics,
442-453.

Factor Analysis, Problems in, by D. N. Lawley, 
394-399.

Factorial Analysis of Multiple Item Tests, by D. N. 
Lawley, 74-82. 

Feather (N.). The Number of the Elements. (Ritchie 
Lecture), 211-220.

Field, Gravitational, The Internal Field of a Particle in 
a, by G. L. Clark, 427-433. 

Fourier Synthesis. Two Numerical Applications of 
Chebyshev Polynomials, by J. C. P. Miller, 204-210. 

Gauss’ Theorem, Approximate form of, by G. L. 
Clark, 412-423.

Geometry, Line-, of the Riemann Tensor, by H. S. 
Ruse, 64-73. 

Gillam (C. M.) and Coulson (C. A.). Van der Waals 
Force between a Proton and a Hydrogen Atom. 
II. Excited States, 360-368. 



Graphite Crystals and Crystallites. I. Binding 
Energies in Small Layers, by Mary Bradburn, 
C. A. Coulson and G. S. Rushbrooke, 336-349.

-Energies of Mobile Electrons in Infinite Strips 

of, by C. A. Coulson and G. S. Rushbrooke, 350-359.

Green (H. S.). See Born (M.) and Green (H. S.). 

Hamiltonian Systems, Universal Integral Invariants 
of, by Hwa-Chung Lee, 237-246. 

Harmonic Synthesis. Two Numerical Applications 
of Chebyshev Polynomials, by J. C. P. Miller, 
204-210. 

Hill’s Problems, by M. J. O. Strutt, 278-296. 

Houstoun (R. A.). A Measurement of the Velocity 
of Light in Water, 58-63. 

Hydrogen Atom, Van der Waals’ Force between a 
Proton and. II. Excited States, by C. M. Gillam 
and C. A. Coulson, 360-368. 

Hypergeometric Functions, Transformations of, by 
A. Erdélyi, 378-385.

Interpolation. Two Numerical Applications of 
Chebyshev Polynomials, by J. C. P. Miller, 204-210. 


Jones (C. W.), Miller (J. C. P.), Conn (J. F. C.) and 
Pankhurst (R. C.). Tables of Chebyshev Poly¬ 
nomials, 187-203. 

Kermack (W. O.) and Eggleton (P.). A Problem in
the Random Distribution of Particles, 103-115. 

Klein’s Quartic, Identification of, by W. L. Edge, 
83-91. 


Ladder Networks, Impedance and Calculation of, by 
W. E. Bruges, 175-186. 

Lamé Functions, Expansions of, into Series of
Legendre Functions, by A. Erdélyi, 247-267.

Laplace’s Equation, Whittaker’s Solution of, by E. T. 
Copson, 31-36. 

Lawley (D. N.). A Note on Karl Pearson’s Selection
Formulae, 28-30. 

-The Factorial Analysis of Multiple Item Tests, 

74-82. 

-Problems in Factor Analysis, 394-399. 

Least Squares, Linear Approximation by, by A. C. 
Aitken, 138-146. 

-Studies in the Algebra of, by A. C. Aitken, 

138-146. 

Lee (Hwa-Chung). The Universal Integral Invari¬ 
ants of Hamiltonian Systems and Application to the 
Theory of Canonical Transformations, 237-246. 

Legendre Functions, Expansions of Lamé Functions
into series of, by A. Erdélyi, 247-267.

Light, a Measurement of the Velocity of, in Water, 
by R. A. Houstoun, 58-63. 

Light-signal Coordinates, by G. C. McVittie, 147-155. 


McVittie (G. C.). Regraduation of Clocks in 
Spherically Symmetric Spaces of General Relativity, 
147-155.

Mass, Equivalent Gravitational, by G. L. Clark, 
412-423. 

-Equivalence of Gravitational and Invariant, 

by G. L. Clark, 424-426. 

Matrices, Commuting Sets of, by D. E. Rutherford,
454-459.

Melville (H. W.). The Future of Synthetic Plastics 
(Bruce-Preller Address), 1-9.

Miller (J. C. P.). Two Numerical Applications of
Chebyshev Polynomials, 204-210. 

- See Jones (C. W.), Miller (J. C. P.), Conn

(J. F. C.) and Pankhurst (R. C.). 

Milne (E. A.). The Fundamental Concepts of Natural
Philosophy. (James Scott Address ), 10-24. 
Milne-Thomson (L. M.). Applications of Elliptic
Functions to Wind Tunnel Interference, 316-318. 


Mirsky (L.). Generalizations of a Problem of Pillai, 
460-469. 

Non-Associative Arithmetics, by I. M. H. Etherington,
442-453. 

Numerical Applications of Chebyshev Polynomials, by 
J. C. P. Miller, 204-210. 

Orthogonal Polynomials. Tables of Chebyshev 
Polynomials, by C. W. Jones, J. C. P. Miller,

J. F. C. Conn and R. C. Pankhurst, 187-203. 

-Two Numerical Applications of Chebyshev

Polynomials, by J. C. P. Miller, 204-210. 

Pankhurst (R. C.). See Jones (C. W.), Miller 
(J. C. P.), Conn (J. F. C.) and Pankhurst (R. C.). 
Parameters, Statistical, Estimation of, by A. C. Aitken, 
369-377. 

Partitive numbers (partitioned cardinals and serials), 
by I. M. H. Etherington, 442-453. 

Peng (H. W.) and Born (M.). Quantum Mechanics
of Fields. I. Pure Fields, 40-57. 

-II. Statistics of Pure Fields, 92-102. 

-III. Electromagnetic Field and Electron 

Field in Interaction, 127-137. 

Philosophy, Natural, The Fundamental Concepts of, 
by E. A. Milne, 10-24. 

Pillai, Generalizations of a Problem of, by L. Mirsky, 
460-469. 

Plastics, Synthetic, by H. W. Melville, 1-9. 

Quantum Mechanics of Fields. I. Pure Fields, by 
M. Born and H. W. Peng, 40-57. 

-II. Statistics of Pure Fields, by M. Born and 

H. W. Peng, 92-102. 

-III. Electromagnetic Field and Electron 

Field in Interaction, by M. Born and H. W. Peng,
127-137. 

Quantum Theory of Rest-Masses, by M. Born and 
H. S. Green, 470-488. 

Random Distribution of Particles, by W. O. Kermack
and P. Eggleton, 103-115. 

Regraduation of Clocks, in General Relativity, by 

G. C. McVittie, 147-155.

Relativity, Foundations of. Parts I and II, by A. G. 
Walker, 319-335. 

-General, A Theory of Regraduation in, by 

A. G. Walker, 164-174. 

-Time-Scales in, by A. G. Walker, 221-228. 

Representation of Complex Symbols, by D. E. 
Rutherford, 25-27. 

Rest-Masses, Quantum Theory of, by M. Born and

H. S. Green, 470-488. 

Riemann Tensor in a Completely Harmonic V 4 , by 
H. S. Ruse, 156-163. 

-Line-Geometry of, by H. S. Ruse, 64-73. 

Ritchie (M.). An Elementary Treatment of Thermal 
Diffusion in Gaseous and Liquid Systems, 305-315. 

- See Docherty (A. C.) and Ritchie (M.). 

Rodriguez (A. E.). See Born (M.) and Green (H. S.). 
Rotating Disc, Relativistic Problem of, by G. L. 
Clark, 434-441. 

Ruse (H. S.). On the Line-Geometry of the Riemann 
Tensor, 64-73. 

-The Riemann Tensor in a Completely Har¬ 
monic V 4 , 156-163. 

Rushbrooke (G. S.). See Bradburn (Mary), Coulson 
(C. A.) and Rushbrooke (G. S.). 

-and Coulson (C. A.). See Coulson (C. A.) and 

Rushbrooke, (G. S.). 

Rutherford (D. E.). On the Matrix Representation 
of Complex Symbols, 25-27. 

-On Substitutional Equations, 117-126. 

-Some Continuant Determinants arising in 

Physics and Chemistry, 229-236. 





Rutherford (D. E.). On Commuting Matrices and 
Commutative Algebras, 454-459. 

Science, discussed as a particular kind of philosophy, 
by H. Dingle, 400-411. 

Selection, Karl Pearson's Formulae for, by D. N. 
Lawley, 28-30. 

Slot Conductors in Magnetic Cores, Impedance of, 
by W. E. Bruges, 175-186. 

Strutt, (M. J. O.). On Hill’s Problems with Complex 
Parameters and a Real Periodic Function, 278-296. 

Substitutional Equations, by D. E. Rutherford, 
117-126. 

Symbols, Complex, Matrix Representation of, by 
D. E. Rutherford, 25-27. 

Synthetic Plastics, The Future of, by H. W. Melville, 
1-9.

Tables of Chebyshev Polynomials, by C. W. Jones, 
J. C. P. Miller, J. F. C. Conn and R. C. Pankhurst, 
187-203. 

Tchebycheff Polynomials. See Chebyshev Poly¬ 
nomials. 

Ternary Quartic, Discriminant of a Certain, by W. L. 
Edge, 268-272. 



Tests, Multiple Item, The Factorial Analysis of, by 
D. N. Lawley, 74-82. 

Time-Scales in Relativity, by A. G. Walker, 221-228. 

Transformations of Hypergeometric Functions of Two 
Variables, by A. Erdélyi, 378-385.

Van der Waals Force between Hydrogen Atom and a 
Proton. II. Excited States, by C. M. Gillam and
C. A. Coulson, 360-368. 

Velocity of Light in Water, a Measurement of, by 
R. A. Houstoun, 58-63. 

Walker (A. G.). A Theory of Regraduation in 
General Relativity, 164-174. 

-Time-Scales in Relativity, 221-228. 

-Foundations of Relativity. Parts I and II, 

319-335.

Wave Functions, Ground States of Atoms Li to Ne, by
W. E. Duncanson and C. A. Coulson, 37-39. 

Whittaker’s Solution of Laplace’s Equation, by E. T. 
Copson, 31-36. 

Wind Tunnel Interference, by L. M. Milne-Thomson, 
316-318.

Wright (E. M.). The Linear Difference-differential 
Equation with Constant Coefficients, 387-393. 


PRINTED IN GREAT BRITAIN BY NEILL AND CO., LTD., EDINBURGH. 












